John Carmack, the consulting CTO for Meta’s virtual reality efforts, is leaving, according to two people familiar with the company. His exit came on Friday.
Carmack announced his decision to leave on the company’s internal Workplace forum, in a post that was openly critical of Meta’s progress in VR and AR, which are core to its metaverse ambitions. He later shared the full post on his Facebook profile.
Even as the company transitions its focus to the metaverse, Carmack said that Meta is running at half its effectiveness: it has “a ridiculous amount of people and resources” only to “squander effort and self-sabotage.”
“It has been a struggle for me. I have a voice at the highest levels here, so it feels like I should be able to move things, but I’m evidently not persuasive enough,” Carmack said in his post.
In 2014, Facebook acquired Oculus, then the leading virtual reality company, for about $2 billion. Carmack was one of the driving forces behind the development of Meta’s virtual reality headsets.
ETL stands for Extract, Transform, and Load, an umbrella term for the process of collecting, transforming, and storing data at a specified location to accomplish a business goal. The process is carried out using specially designed ETL tools. Depending on the volume and complexity of the data and the number of queries required, enterprises can either purchase commercial tools or use open-source ETL tools. But first, it is necessary to know what ETL tools do.
Extract: In the first data processing step, ETL tools “extract” or collect data from the desired location. The tools detect the storage format and security controls, then issue queries to read the data and check whether it has changed since the last extraction.
Transform: ETL tools alter the extracted data to make it suitable for the target location where it will be loaded. Depending on the queries, the tools may change values in table cells, add or delete rows and columns to maintain consistency, and interact with different applications to do so.
Load: After transforming the data, the ETL tool loads it into the target location, most often a data lake or a data warehouse used for analysis. The tool also optimizes the loading process for bulk loading, maximum efficiency, and minimum loading time.
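To make the three stages concrete, here is a minimal sketch of an ETL pipeline in Python. The source file, table schema, and transformation rules are hypothetical; real ETL tools layer scheduling, change detection, and bulk-load optimizations on top of this basic pattern.

```python
import csv
import sqlite3

def extract(path):
    """Extract: collect raw rows from a source CSV file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Transform: reshape rows so they fit the target schema."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "name": row["name"].strip().title(),  # enforce consistent casing
            "amount": float(row["amount"]),       # cast text to a numeric type
        })
    return cleaned

def load(rows, db_path="warehouse.db"):
    """Load: bulk-insert the transformed rows into the target database."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (:name, :amount)", rows)
    con.commit()
    con.close()

# Hypothetical input file; each run re-extracts, transforms, and loads.
load(transform(extract("sales.csv")))
```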
This article lists some of the best open-source ETL tools.
Top 10 Open-source ETL tools
Listed below are some of the most useful open-source ETL tools.
Jaspersoft ETL
Jaspersoft ETL is a powerful, open-source, and versatile tool powered by Talend. The tool is part of TIBCO’s product portfolio and is specially designed for seamless integration of large volumes of complex data. Developers can graphically plan, schedule, and manage data workflows and transformations to load any target location, such as an Operational Data Store (ODS), data mart, or data warehouse. Once the data is loaded, it can be used for centralized reporting and advanced analytics. Jaspersoft ETL offers a Community Edition with over 500 connectors and components plus version control, and an Enterprise Edition with embeddable web reporting and self-service BI tools.
CloverDX (CloverETL)
CloverETL was one of the first open-source ETL tools, developed when data warehousing started gaining momentum. Since then, it has improved dramatically as data has become progressively more complex. The company currently offers a global service, a flexible data integration platform, and strong support and services teams that actively aid enterprises in their data operations. Over the years, the product has evolved into CloverDX, an entire “Data Experience” with a holistic approach and greater flexibility. With CloverDX, enterprises can leverage multiple data management tools while automating the entire ETL process. Nearly every data source or output can be connected using CloverDX. Additionally, it breaks down data silos, prevents vendor lock-in, and lets you customize connections specific to your business requirements.
Apache NiFi
Next on our list of open-source ETL tools is Apache NiFi, a robust tool designed to leverage the capabilities of the host system it runs on. It helps process, distribute, route, transform, and mediate system data. NiFi provides a web-based user interface that lets users switch between design, control, feedback, and monitoring. Dataflows can be built visually and in real time: any changes you make to a flow take effect right away.
Additionally, it is highly configurable, offering low latency, runtime flow modification, dynamic prioritization, and back-pressure control for enhanced efficiency. It can also be extended with multi-tenant authorization and supports standard protocols and strategies.
Scriptella ETL
Scriptella is another open-source ETL and script execution tool. Released under the Apache license, it is written in Java and can execute scripts written in JavaScript, JEXL, Velocity, and more. Unlike many other open-source ETL tools, Scriptella handles cross-database ETL operations and provides a developer-friendly experience, as it is interoperable with LDAP, JDBC, XML, and other technologies. No prior knowledge of SQL (or any other programming language) is required for basic ETL operations, making it very convenient for beginner and intermediate-level developers.
Jedox ETL
While the other open-source ETL tools here focus on executing the process, Jedox ETL also focuses on planning, investigating, and monitoring performance during extraction, transformation, and loading. With its powerful data integration and preparation tool, developers can extract and import vast amounts of data from any source. Jedox also provides a user-friendly web-based interface for visual data modeling, enabling non-technical users to undertake more complex projects.
Jedox Integrator offers preconfigured interfaces to all well-established relational databases and ERP/finance, CRM, HCM, and SCM applications. Any additional cloud or on-premises data source can be integrated using flexible connections, providing seamless authentication using a standard interface.
KETL
KETL is a production-ready, open, multi-threaded ETL platform built on an XML-based architecture. It allows the management of complex data, scheduling, and ETL activities through an advanced data integration platform. The multi-threaded engine comprises several job executors, each performing a specific function; their actions fall mainly under three categories: SQL, XML, and OS. KETL also supports additional job types via the KETL API. All kinds of data are supported, including relational data, flat files, XML data sources, and proprietary database APIs. Data integration and time/event-based scheduling require no third-party dependencies.
GeoKettle
GeoKettle is a powerful, metadata-driven ETL tool that integrates data from several sources to build and maintain geospatial databases. It is a “spatially-enabled” version of the Pentaho Data Integration software, formerly known as Kettle. With GeoKettle, users can extract data, transform it to fix errors, clean it, change its structure, make it consistent with standards, and then load the modified data into a GIS file, a target DBMS, or a geographic web service. The tool is mainly used for automating repetitive jobs without code. Thanks to its functionality and read/write support for numerous file formats, services, and DBMSs, GeoKettle is fast, standards-compliant, and reliable, making it one of the best open-source ETL tools.
Apache Camel
Another open-source ETL tool from Apache, Camel is an integration framework that enables users to integrate multiple systems that consume and produce data. It is a standalone tool that can also be embedded as a Spring Boot or Quarkus library. Camel implements most standard Enterprise Integration Patterns (EIPs) for data transformation and routing, and keeps evolving to cover newer patterns. Additionally, with support for several industry-standard formats from the financial, telco, healthcare, and other sectors, Camel supports about 50 data formats. Recently, Apache Camel 3.19 was released with several new features and significant improvements.
Singer
Singer is one of the most capable open-source ETL tools for data extraction and loading. It is sponsored by Stitch, a fully managed data pipeline; with Stitch, you can automate monitoring and alerting while running Singer taps on a schedule and streaming the data to any target location. Singer handles data extraction via scripts called “taps” and data loading via scripts called “targets,” which together move data from any desired source to a destination. Taps extract data and output it in a JSON-based format, while targets consume the data that taps extract and load it into a file, API, or database. Singer is available on GitHub for everyone to access for free.
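To illustrate the tap/target contract, here is a minimal sketch of a Singer-style tap in Python. The stream name and records are hypothetical; a real tap would pull them from a live source and follow the full Singer specification.

```python
import json
import sys

# Hypothetical records; a real tap would extract these from an actual source.
USERS = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

def emit(message):
    """Singer taps write one JSON message per line to stdout."""
    sys.stdout.write(json.dumps(message) + "\n")

# A SCHEMA message describes the stream's shape before any records are sent.
emit({
    "type": "SCHEMA",
    "stream": "users",
    "schema": {"properties": {"id": {"type": "integer"},
                              "name": {"type": "string"}}},
    "key_properties": ["id"],
})

# Each extracted row becomes a RECORD message.
for user in USERS:
    emit({"type": "RECORD", "stream": "users", "record": user})

# A STATE message lets the next run resume where this one stopped.
emit({"type": "STATE", "value": {"users": {"last_id": 2}}})
```

Because a target reads these lines on stdin, a tap and a target are typically chained with a shell pipe, e.g. `python tap_users.py | target-csv`.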
Matillion
The last entry on our list is Matillion, an advanced ETL service that is part of a modern data stack designed for cloud-agnostic enterprises, helping them manage day-to-day business data operations. Users can collect data from any source using its connectors and pipelines, and Matillion simplifies pipeline management with batch loading from a single control panel. With Matillion’s lifetime free basic plan, enterprises can integrate with Facebook, Gmail, Google BigQuery, Intercom, Azure SQL, LDAP, and many other sources to gather and analyze data. For more advanced features, Matillion offers paid plans depending on business needs.
Many people use Microsoft Excel or Google Sheets daily to identify trends, organize data, and sort it into meaningful categories. But these tools are less intuitive for the general public when it comes to curating data presentations: compared to simpler products like Microsoft Word, users need prior knowledge before getting started with advanced Excel or spreadsheet work, because there are so many options, functions, and formulas to choose from, making the tools challenging to master. However, specific artificial intelligence tools and bots can help. These bots aim to ease the challenges of creating an Excel sheet, especially one with numerous formulas.
This article lists some useful Excel formula bots and alternatives. Have a look.
List of useful Excel Formula Bots
Here is a list of Excel formula bots worth trying.
Excel Formula Bot
Excel Formula Bot is probably the most renowned AI bot for generating formulas from plain-text descriptions.
The Excel Formula Bot provides an input field where the user describes what is needed. The website then generates a formula from the prompt and showcases an example so users understand how to phrase their inputs.
Developing formulas with an Excel formula bot is as easy as typing a plain-language description of what you want.
QRS Toolbox for Excel
To save time while working with Excel spreadsheets, QRS Toolbox is an excellent Excel formula creator that provides custom functions for writing short, standard formulas. With no need for complex computing or VBA code, it is available as an add-in and processes data directly within Excel, reducing software dependence. As an alternative to the Excel Formula Bot, the QRS Toolbox is notable as the only publicly available add-in that fits all Pearson and Johnson distributions, along with other custom techniques not commonly found in other software.
Excel CoPilot
Excel CoPilot is an excellent Excel Formula Bot alternative that can save hours every week. The bot uses artificial intelligence to generate complex spreadsheet formulas precisely. It has been trained on millions of lines of text and code, eliminating the need for users to work through code to generate formulas. The AI-powered bot works on a text-to-formula principle using natural language and is available as a free Chrome extension.
Publisheet
Another Excel Formula Bot alternative, Publisheet, is an Excel add-in that gives users a dynamic, cloud-hosted worksheet experience. The add-in supports formulas without any coding requirements and is an efficient tool for converting spreadsheets into web pages from within Excel. It is compatible with Excel 2016 or later and the online version of Excel. Publisheet also allows users to create custom reports accessible via a public URL.
Lumelixir or OneTap.ai
Lumelixir, also called OneTap.ai, is one of the best AI-based Excel formula bots for sparing you from googling formulas while working with spreadsheets. The people behind Lumelixir believe time is the most valuable resource, and the tool helps by generating complex formulas within seconds. Given the input “Find and replace ‘specific’ with ‘set’ from Column AQ,” Lumelixir outputs the formula “=find(“specific”,AQ)&replace(“specific”,”set”,AQ).”
Formula Builder from Daniel’s XL Toolbox
Instead of using an Excel formula bot online, the Formula Builder from Daniel’s XL Toolbox can generate formulas by automatically collecting the cell references. The Formula Builder only requires four cell ranges: input groups (group names), input data (cells that contain data), output groups (groups for which a formula is needed), and output formulas (cells where the Formula Builder writes the formula). Once these inputs are provided, it generates the desired output.
The Formula Builder works out the input groups and input data to generate the sorted output groups and the corresponding formulas.
Web scraping is a mechanized technique for extracting massive amounts of data from websites. The bulk of this data is semi-structured HTML that is later transformed into structured data and stored in a database or spreadsheet so that it can be used in multiple applications. Although web scraping can be done manually, automated methods using dedicated web scraping tools are typically preferred, since they can be less expensive and faster.
Web scraping, however, is typically not a simple operation. First, the scraping tool is fed one or more URLs. The scraper then loads the entire HTML code for each requested page and extracts the required data; some advanced web scraping tools can render the whole webpage, including CSS and JavaScript elements. Finally, the web scraper exports the acquired data in a more user-friendly format. Most online scrapers output data to Excel spreadsheets or CSV, while more sophisticated ones support other formats like JSON.
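As a minimal sketch of that final export step, the snippet below writes hypothetical already-extracted records to both CSV and JSON, the formats most scrapers emit:

```python
import csv
import json

# Hypothetical records already extracted by a scraper.
records = [
    {"title": "Widget A", "price": "19.99"},
    {"title": "Widget B", "price": "24.50"},
]

# CSV output: one flat row per record, convenient for spreadsheets.
with open("products.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(records)

# JSON output: preserves nesting for more complex record structures.
with open("products.json", "w") as f:
    json.dump(records, f, indent=2)
```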
Because of digitization, websites have accumulated massive amounts of data, and web scraping techniques have gained popularity. Web scraping tools differ in functionality and features, since website data comes in various kinds and sizes. Some of the standard data scraping techniques the best web scraping tools use are:
HTTP programming
HTML parsing
DOM parsing
Semantic Annotation
Computer vision web-page analysis
Best Web Scraping Tools you Should Try
Since every web crawler is different, choosing the right one can be challenging. This article describes what a website scraper is and compiles a list of some of the best web scraping tools.
Scraper API
Web scraping is a complex procedure, but Scraper API, one of the best web scrapers, simplifies it by handling proxies, browsers, and CAPTCHAs. The company has built multiple web scrapers and repeatedly refined its setup procedures to offer use-specific scrapers, since the right scraper depends on the data you need to extract. Nevertheless, all its APIs work similarly: the user requests a particular source, the API receives the request and connects to the target system, then extracts the data and returns it to you for immediate use or for storage and later processing.
This webpage scraper offers a new Async Scraper endpoint that enables web scraping jobs at scale without any timeouts, making data scraping more resilient. The API offers seamless integration with NodeJS, NodeJS Puppeteer, and Cheerio.
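As a rough sketch of how such a proxy-handling API is typically called from Python (the endpoint and parameters follow Scraper API’s documented pattern, but check the current docs; the API key and target URL below are placeholders):

```python
import requests

API_KEY = "YOUR_API_KEY"        # placeholder: your Scraper API key
target = "https://example.com"  # placeholder: the page you want to scrape

# One GET request; the service handles proxies, retries, and CAPTCHAs.
response = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": target},
    timeout=70,  # scraping through rotating proxies can be slow
)
response.raise_for_status()
html = response.text  # raw HTML, ready for a parser such as Beautiful Soup
```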
You can avail of 5,000 free API credits with the 7-day trial, after which plans start at US$49/month. For more pricing details, you can visit the website.
Smartproxy SERP Scraping API
Web scraping Google’s search results pages can be tedious, as Google does not allow it. Moreover, scraping at a rate higher than eight keyword requests per hour risks detection, and more than ten keyword requests per hour will result in blocking. An excellent solution is the SERP Scraping API, the flagship web scraping tool from Smartproxy, which combines a web scraper, a data parser, and a sizable proxy network in a single API.
This full-stack web scraping tool allows users to send a single API request to retrieve structured data from the major search engines. Smartproxy’s search engine proxies can be used for everything from monitoring prices to retrieving paid and organic results to examining keyword rankings and other SEO metrics in real time.
Smartproxy’s scraper service offers multiple plans based on your proxy request requirements. There are four plans: Lite, Basic, Standard, and Solid. Additionally, Smartproxy offers enterprise-level plans for more complex needs. For more details, you can check their pricing page.
ParseHub
The previous generation of scraping tools relied on hours of hand-written code. To make web scraping more precise and save coding time, no-code development platforms like ParseHub have come into the picture.
With this web scraping tool, users can create their data extraction workflows without programming knowledge. ParseHub manages the selection of source-code elements and the prediction of neighboring elements on its own.
ParseHub offers a free version, with no credit card required, that lets users extract 200 pages per run within 40 minutes. It also offers Standard, Professional, and Enterprise plans with better service and more pages. Check their pricing page for more details.
Web Scraping using Beautiful Soup
The data scraping tools mentioned above are third-party products. You can also scrape data manually using open-source libraries and your own code. Beautiful Soup is a Python library that extracts data from HTML, XML, and other similar formats. Put simply, it helps users pull specific content from a webpage by stripping away the HTML markup and keeping the information. The library can be used to isolate titles, links, and text from HTML tags and to alter the HTML within a document.
To scrape data, the user sends an HTTP request to the target URL. Once access is granted, the response has to be parsed using an HTML parser, such as html5lib, which creates a nested data structure. The final step is to navigate and search the parse tree using Beautiful Soup.
It is a no-cost way to extract data from web pages. Install the third-party libraries requests, html5lib, and bs4 using the pip command and follow the steps below to scrape data.
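Here is a minimal sketch of those steps; the target URL is a placeholder, and the tags you search for will depend on the page’s actual markup:

```python
import requests
from bs4 import BeautifulSoup

# Step 1: send an HTTP request to the target URL (placeholder address).
url = "https://example.com"
response = requests.get(url, timeout=10)
response.raise_for_status()

# Step 2: parse the HTML into a nested data structure with html5lib.
soup = BeautifulSoup(response.text, "html5lib")

# Step 3: navigate and search the parse tree.
print(soup.title.string)          # the page title
for link in soup.find_all("a"):   # every hyperlink on the page
    print(link.get("href"), link.get_text(strip=True))
```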
Octoparse
Octoparse is a cloud-based web data extraction tool that helps scrape data from various websites. Users can use it to collect product comments, reviews, social media posts, and other unstructured data, and save it in different formats, including HTML, Excel, and plain text. Octoparse can run multiple extraction tasks simultaneously, scheduled in real time or at regular intervals.
Octoparse offers two customized modes: the Wizard Mode, which provides step-by-step instructions for scraping data, and the Advanced Mode, which offers features for more complex web pages. Additionally, its IP rotation feature helps prevent blocking by target sites.
Octoparse provides services on a monthly subscription basis, including email and online knowledge-base support. The free plan has a cap of 10,000 records per export and a low number of concurrent crawlers and runs. For more information, refer to the pricing page.
Helium Scraper
Most websites that display lists of information do so by querying a database and presenting the data intuitively. A web scraper reverses this procedure, taking unstructured websites and converting them back into a database. Helium Scraper is a web scraper that focuses on the kind of data to be extracted rather than on how to extract it.
It offers software for web scraping using multiple off-screen Chromium browsers, presents a simple interface, and integrates web scraping and API calling into a single project. The web scraper tool also supports JavaScript code and function generation to match, split or replace extracted text.
A 10-day trial is available for new users to get started. After that, a one-time purchase buys the software for a lifetime. For more information, refer to the pricing page.
Apify
Apify is an automation, data extraction, and web scraping platform. With Apify, users can create an API with integrated datacenter and residential proxies for extraction. For sites like Instagram, Facebook, Twitter, and Google Maps, Apify Store offers ready-made scraping solutions, and developers can build customized scraping tools for other websites while Apify handles infrastructure and payment.
Apify offers shared IPs and seamless integration with Keboola, Transposit, Airbyte, Zapier, and other similar platforms. It supports tools and languages such as Selenium, Python, and PHP.
Apify provides 1,000 free API requests. Beyond that, plans start at US$49/month, with a 20% discount for yearly payment. For more information, refer to the pricing page.
Zenscrape API
Despite the many online scraping solutions available, Zenscrape is one of the most reliable data scrapers. It handles web scraping at scale while resolving common problems, and it is another online scraping tool with no coding requirements. With Zenscrape, users can extract data even from websites with anti-scraping measures, thanks to its IP rotation, CAPTCHA solving, and other features.
Zenscrape provides a user-friendly interface and JavaScript rendering, and supports many front-end frameworks, such as jQuery, Vue, and React. Additionally, Zenscrape does not limit the number of queries per second, and every request is allotted a unique IP address.
Zenscrape offers a lifetime free plan for US$0, a Small plan for US$24.99, a Medium plan for US$79.99, and a Large plan for US$199.99. For more information, refer to the pricing page.
Import.io
There are plenty of ways to scrape data and mine information from a website. One of the many services that aims to streamline the scraping process is Import.io, an e-commerce platform that helps enterprises create smarter analytics and offers web scraping assistance. It provides a free, convenient data scraping service, even for websites that employ JavaScript and display results over numerous pages.
Users can download, install, and launch Import.io for Windows, OS X, and Linux from the website, then create an Import.io account (free for up to 250,000 page calls each day) or sign in with a GitHub, Google, or LinkedIn account. See the website to learn more about prices.
Sequentum Content Grabber
Sequentum Content Grabber is another low-code web data extraction tool that automates the extraction process by adapting to recurrent data, code, and environment changes. The scraper is aimed at enterprises that wish to reduce coding labor and time by creating stand-alone web crawling agents.
The end-to-end data extraction platform can be used in-house or outsourced for web data. The tool offers total control over web data extraction, document management, and intelligent process automation (IPA). Users can create scripts or debug the crawling process using C# or VB.NET, and almost any website’s content can be extracted and saved as structured data in the desired format.
The annual enterprise license starts at US$15,000. To scale their operations, some enterprises may require additional licenses, which can be added at additional cost. Refer to the main website for prices.
Infinity AI, a synthetic data generation startup, has raised $5M in a seed funding round led by Diana Kimball Berlin at Matrix to help teams build AI models faster using synthetic data. Founders and operators from companies like Tesla, Snorkel AI, and Google also participated in the round.
The company observed that AI models are only as good as the data they are trained on, making data collection one of the main challenges in building better models. According to studies cited by Infinity AI, many data scientists spend 80% of their time gathering, organizing, and labeling training data, and as a result, many AI projects never reach production.
According to Infinity, synthetic data can solve the training data collection problem: its platform lets users upload a single real video and transform it into hundreds of perfectly labeled synthetic videos.
Over the past two years, several companies have relied on synthetic data to solve the training data collection problem and enhance their AI and machine learning models, including Tesla, Amazon, and Microsoft.
Infinity AI stated that an AI model’s accuracy is directly correlated with its training data. Collecting real-world data is time-consuming and expensive, and once collected, the data must be correctly classified and annotated before it can be used for training. Therefore, many organizations are moving toward synthetic data, especially when data acquisition and labeling budgets are limited.
retrain.ai, a leading talent intelligence platform, was selected on Thursday as one of Globes’ most promising Israeli startups of 2022, nominated by Israel’s premier venture capital firms. Powered by artificial intelligence and real-time labor market data, retrain.ai helps enterprises attract, hire, and retain employees and upskill their workforce.
retrain.ai also announced the closing of an additional $14 million investment round led by AI-focused Radical Ventures, bringing its total funding to $34M. Other investors, including Square Peg, Hertz Ventures, .406 Ventures, Schusterman Family Investments, TechAviv, and Splunk Ventures, also participated in the round.
Globes is Israel’s leading economic newspaper; it published the list of ten promising companies based on nominations from leading global investment funds active in the Israeli high-tech market. It analyzed about 4,000 tech companies, and retrain.ai was recognized for its cutting-edge AI technology and its ability to help people gain the in-demand, emerging skills necessary for a productive career.
Dr. Shay David, co-founder and CEO of retrain.ai, said the company is highly honored to be recognized by Globes among the thousands of tech companies surveyed. Radical’s investment will drive immediate value and growth in the international market, where retrain.ai will help millions of people find jobs.
Biochemists from the Netherlands Cancer Institute have presented AlphaFill, an algorithm that builds on DeepMind’s AlphaFold protein structure prediction model. AlphaFill enriches predicted protein structures by transplanting missing ions and small molecules from previously determined, experimentally solved protein structures.
Proteins, the building blocks of life, can be thought of as ribbons of amino acids that wrap up into knots of complicated twists and turns. Because amino acids are naturally flexible, a typical protein can assume an estimated 10^300 different configurations. DeepMind claims that AlphaFold has predicted the structures of more than 200 million proteins.
Following in its footsteps, Meta released another protein folding model called ESMFold, which reportedly runs approximately 60x faster than AlphaFold. However, none of the currently available models provide small-molecule coordinates relevant to structure or function.
While AlphaFold and ESMFold predict domain structures almost perfectly, they are less accurate on flexible protein regions such as loops or intrinsically disordered regions. AlphaFill covers these gaps by filling in the ‘missing’ elements of a protein structure.
To validate the approach, AlphaFill’s predictions were compared against experimentally determined structures, and the results were compiled into a repository called the AlphaFill databank. The databank aims to help scientists develop new hypotheses about protein folding and function. You can read more about the algorithm in the research paper.
Former US President Donald Trump has launched a US$99 digital trading card NFT collection via his social media website, Truth Social. The collection comprises 45,000 fantasy-style NFTs minted on Polygon, featuring images reminiscent of collectible baseball cards.
Shares of Digital World Acquisition Corp. saw a 10% surge after Trump teased a major announcement. The announcement was expected to concern the shareholder vote on the merger of Digital World (a special-purpose acquisition company) with Trump’s Truth Social, so when it turned out to be the launch of US$99 digital NFTs, it failed to gin up much excitement.
Trump posted on Truth Social, “These limited edition cards (Donald Trump NFTs) feature amazing ART of my Life & Career! Collect all of your favorite Trump Digital Trading Cards, very much like a baseball card, but hopefully much more exciting…“ He added that these cards, priced only at US$99, would make a fabulous Christmas gift.
The NFT cards can be purchased from the collection’s website, and collectors who buy the Trump trading cards are automatically entered into a “sweepstakes” to win experiences with Trump, such as a Zoom call, a dinner in Miami, or cocktails at Mar-a-Lago.
Accenture announced that it has become an inaugural member of the Corporate Affiliate Program at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The institute focuses on building a responsible AI-driven future and will help Accenture advance its commitment to the domain.
The HAI Corporate Affiliate Program gives member companies opportunities to work closely with Stanford faculty and students on solutions spanning AI research, policy, industry, and education.
Fei-Fei Li, co-director of the institute, said, “I’m particularly looking forward to seeing the fruit of the AI research projects envisioned and engaging Accenture’s ecosystem in HAI’s critical mission relating to education and human-centered AI.”
Accenture is a global cloud and technology leader that embraces the power of change, adding value in more than 40 industries across over 120 countries. With almost 721,000 people on board, it offers unmatched experience, strategy, and consulting services via its network of advanced technology and intelligent operations centers.
Through its HAI membership, Accenture will sponsor a number of academic research initiatives focused on AI trust and safety in the program’s first year. It will also collaborate with graduate students at the beginning of Stanford’s winter term to conduct research focused on ethical AI.
Protect AI, an AI-focused security company, has raised US$13.3 million in a seed funding round co-led by Acrew Capital and Boldstart Ventures. The investment will be used for product development and for expanding customer engagement with Protect AI’s security solutions.
Protect AI is one of the few security companies focused entirely on tools that protect AI systems and machine learning models from malicious attacks and exploits. Its product line is intended to help developers locate and resolve security flaws across the phases of the machine learning life cycle.
Ian Swanson, CEO of Protect AI, said that as AI advances and is applied to newer use cases, organizations require more robust security systems to recognize and combat threats surrounding their machine learning code. He added, “We have researched and uncovered unique exploits and provide tools to reduce risk inherent in [machine learning] pipelines.”
There is no evidence yet of an increase in the intensity or frequency of attacks on AI systems, but Swanson believes prevention is key, as forecasts suggest such attacks will grow. He noted that many code security solutions are incompatible with open-source coding environments like Jupyter Notebooks.
These vulnerabilities arise due to a lack of innovation on the part of security service providers, and Protect AI covers this gap. The company’s first product, NB Defense, is a security plugin designed for Jupyter Notebook. NB Defense scans Jupyter notebooks for any threats and provides workaround suggestions.
According to Swanson, Protect AI’s seed funding will be used to add support for prominent AI development tools such as Amazon SageMaker, Azure ML, and Google Vertex AI Workbench, in addition to Jupyter Notebooks.