Introduction
An interesting statistic from Statista puts the number of internet users globally at roughly 5 billion, with around 65% of traffic coming from mobile devices and the rest from desktops and tablets. At the same time, the usage and capabilities of AI are growing rapidly. Given such a huge user base for websites and browsers, it’s crucial for businesses and developers to understand how AI can be leveraged to serve internet users better.
AI is already being leveraged by products such as ChatGPT, Netflix, Amazon, and many more, but the compute required for these AI tasks typically runs on a cloud service or a remote server. This approach has a few concerns:
- Privacy: personal data may be sent to the cloud for processing
- Latency: every request incurs a server round trip
- Cost: cloud AI services and API calls add to operating costs
Browser-native AI helps mitigate these issues for certain AI tasks. Let’s explore what browser-native AI is and how websites can make use of it.
What is Browser-native AI?
Before understanding Browser-native AI, it’s important to understand what On-Device AI is. With the On-Device AI approach, the compute required for AI tasks happens locally using the AI models available within the user device, eliminating the need for the data to be sent to the cloud.
Similar to the above approach, with browser-native AI the models are downloaded into the user’s browser on demand for local computation. Browsers can make use of powerful web APIs such as WebAssembly (WASM), WebGL, and WebGPU to utilize the GPU or CPU of the user’s device for AI computation.
What powers Browser-native AI?
Browser-native AI uses the user’s local device hardware for AI computation. Browsers provide access to that hardware through APIs such as WASM, WebGL, and WebGPU, and AI models can leverage these APIs to run inference on the device’s CPU or GPU.
A comparison of the browser APIs that power browser-native AI:
| Web API | Purpose | Browser support |
|---|---|---|
| WebGL | Allows direct use of the device’s GPU. Primarily intended for 2D/3D rendering, but can be used for machine learning tasks as well. | Modern browsers: Chrome, Firefox, Safari, Edge |
| WebGPU | Next-gen web graphics API (successor of WebGL) that leverages the device’s GPU for graphics and machine learning operations. | Chrome, Edge |
| WASM | Allows code written in languages such as C, C++, and Rust to run on the web with near-native performance. Helps AI models with CPU inference. | Modern browsers: Chrome, Firefox, Safari, Edge |
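To check which of these acceleration paths a given browser actually exposes, you can feature-detect them at runtime. Below is a minimal sketch; the WebGPU check follows the `navigator.gpu` shape from the WebGPU spec, and `requestAdapter()` can still resolve to null on unsupported hardware.

```js
// Feature-detect the browser capabilities that power browser-native AI.
async function detectAccelerationSupport() {
  // WebAssembly is available in all modern browsers.
  const hasWasm = typeof WebAssembly === 'object';

  // WebGL: try to obtain a rendering context on a throwaway canvas.
  const canvas = document.createElement('canvas');
  const hasWebGL = !!(canvas.getContext('webgl2') || canvas.getContext('webgl'));

  // WebGPU: navigator.gpu exists only where WebGPU is implemented,
  // and requestAdapter() may still resolve to null on unsupported hardware.
  let hasWebGPU = false;
  if ('gpu' in navigator) {
    hasWebGPU = (await navigator.gpu.requestAdapter()) !== null;
  }

  return { hasWasm, hasWebGL, hasWebGPU };
}

detectAccelerationSupport().then(console.log);
// e.g. { hasWasm: true, hasWebGL: true, hasWebGPU: false }
```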
How to implement Browser-native AI?
There are two approaches to powering your website with browser-native AI:
1. Built-in AI APIs

Chrome has launched built-in AI APIs in the browser for performing natural language tasks. These APIs are easy and straightforward to use; the AI models used under the hood are managed by the browser, with no effort required from the developer. Different expert and foundation models are downloaded and cached in the browser depending on the API. Below are the APIs which are stable and available from Chrome 138 (a usage sketch follows the list):
- Language detector API: Detects the language of a given piece of text.
- Translator API: Translates text from one language to another (limited language support).
- Summarizer API: Condenses lengthy text such as articles or chat conversations into key points.
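As a rough sketch of how these stable APIs can be combined, the snippet below summarizes a piece of text and then translates the summary. The option values follow Chrome’s built-in AI documentation at the time of writing and may still change, so always feature-detect before calling them.

```js
// Sketch: summarize text on-device, then translate the summary.
async function summarizeAndTranslate(text) {
  if (!('Summarizer' in self) || !('Translator' in self)) {
    throw new Error('Built-in AI APIs are not available in this browser.');
  }

  // The underlying models are downloaded and cached on first use.
  const summarizer = await Summarizer.create({
    type: 'key-points',
    format: 'plain-text',
    length: 'short',
  });
  const summary = await summarizer.summarize(text);

  const translator = await Translator.create({
    sourceLanguage: 'en',
    targetLanguage: 'fr',
  });
  return translator.translate(summary);
}
```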
The following APIs are still experimental and are expected to become generally available soon. They can be enabled in the Chrome browser by turning on a flag (chrome://flags/#prompt-api-for-gemini-nano).
- Writer API: Helps users write new content such as blog posts, emails, etc.
- Rewriter API: Helps revise existing text, for example by restructuring it or condensing a long passage.
- Prompt API: Provides direct access to the Gemini Nano model for free-form prompting (a sketch follows this list).
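For illustration, here is a minimal Prompt API sketch based on the experimental shape Chrome documents behind the flag above; both the global `LanguageModel` name and the method signatures may change before the API stabilizes.

```js
// Sketch: prompt the on-device Gemini Nano model via the experimental Prompt API.
async function askLocalModel(question) {
  if (!('LanguageModel' in self)) {
    throw new Error('The Prompt API is not enabled in this browser.');
  }

  // create() triggers the model download on first use if needed.
  const session = await LanguageModel.create();

  // prompt() resolves with the full response; a streaming variant also exists.
  const answer = await session.prompt(question);

  session.destroy(); // release the on-device resources
  return answer;
}

askLocalModel('Explain WebGPU in one sentence.').then(console.log);
```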
One important concern is that these APIs are not available in other browsers; cross-browser support is an ongoing process.
2. Open-source ML libraries
Open-source packages can be used in websites to run AI models locally in the browser. Below are the most popular JavaScript libraries for this. With this approach, developers need to be mindful of hardware requirements, as different models have different minimum specs.
- TensorFlow.js: A JavaScript machine learning library that makes it easy to run out-of-the-box pre-trained models in web applications.
- Transformers.js: Runs transformer-based models from the Hugging Face Hub in the browser, with no server required.
Both libraries support AI tasks across modalities such as computer vision, natural language processing, and audio processing.
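As an example of the library route, the sketch below runs a sentiment-analysis pipeline entirely in the browser with Transformers.js. It assumes the `@huggingface/transformers` package and its default sentiment-analysis model; treat the model choice as illustrative.

```js
// Sketch: run a sentiment-analysis model fully in the browser (ES module).
import { pipeline } from '@huggingface/transformers';

// The model weights are fetched once and cached by the browser.
const classifier = await pipeline('sentiment-analysis');

const result = await classifier('Browser-native AI keeps my data on my device!');
console.log(result);
// e.g. [{ label: 'POSITIVE', score: 0.99 }]
```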
Benefits of Browser-native AI
Running AI tasks locally offers the following benefits:
- Privacy: Sensitive user information is processed locally instead of being sent to a cloud/server, reducing the risk of private data leaking.
- Offline usage: Once the model is downloaded and cached in the browser, AI tasks can be performed without internet connectivity. This capability can help progressive web apps in offline mode.
- Instant results: As there is no server round trip for compute, results can be delivered faster to the users.
- Cost savings: From the business perspective, eliminating cloud AI compute operations and API calls can reduce costs significantly.
Limitations of Browser-native AI
Though browser-native AI offers significant benefits for businesses and users, there are certain limitations that both should be aware of:
- Browser support: Built-in AI APIs are available only in Chrome. It may take some time for other browsers to adopt these APIs.
- Hardware requirements: As AI compute happens locally, significant hardware specs may be required. B2C features that rely entirely on local compute are therefore not feasible, since not all users have mid- to high-end machines.
- Not mobile ready yet: Chrome’s built-in AI APIs are currently available only on desktop.
Considering the above limitations, it’s better to take a hybrid approach in which AI compute falls back to the cloud/server when browser-native AI is not supported on the user’s machine, as sketched below.
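A hybrid setup can be as simple as feature-detecting the built-in API and falling back to a server endpoint when it is missing. In the sketch below, the `/api/summarize` endpoint is a hypothetical server-side fallback you would implement yourself, and the `availability()` states follow Chrome’s documentation at the time of writing.

```js
// Sketch: prefer on-device summarization, fall back to the server when unavailable.
async function summarizeHybrid(text) {
  if ('Summarizer' in self) {
    const availability = await Summarizer.availability();
    if (availability !== 'unavailable') {
      const summarizer = await Summarizer.create();
      return summarizer.summarize(text); // runs locally; the text never leaves the device
    }
  }

  // Fallback: hypothetical cloud endpoint that performs the same task server-side.
  const response = await fetch('/api/summarize', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  const { summary } = await response.json();
  return summary;
}
```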
Caveats in using Browser-native AI
There are some caveats in both implementing and using browser-native AI.
From the user’s perspective: Although the user’s data is processed locally on the device, it still passes through the web page’s interface on its way to the AI model, so there is a privacy risk if the website logs the user’s input or prompts. Users should therefore be cautious about entering sensitive information into any website.
From the developer’s perspective: Irrespective of where the data is processed, locally or in the cloud, privacy regulations such as GDPR still apply, so developers must handle user data for AI tasks responsibly. Transparent warning notes can also be added wherever a user prompt is required for AI processing, so users know to be careful. For example: “Do not enter any personal or sensitive information.”
Conclusion
AI is evolving rapidly and taking new forms, and browser-native AI is one of them. Yes, there are browser and hardware limitations today, but devices with specs that meet AI requirements will only become more common. Though features that rely entirely on browser-native AI cannot be built for every user just yet, developers can use this time to ideate and innovate on new possibilities in anticipation of native AI integration across all browsers.