
Beyond ChatGPT: Why Researchers Are Turning to Small AIs on Their Laptops

The website histo.fyi serves as a comprehensive database focused on the structures of immune-system proteins, specifically major histocompatibility complex (MHC) molecules. It features images, data tables, and amino acid sequences. Bioinformatician Chris Thorpe, the site’s creator, leverages artificial intelligence (AI) tools known as large language models (LLMs) to turn this data into clear, readable summaries. However, instead of using web-based LLMs like ChatGPT, Thorpe runs the AI locally on his laptop.

Over the past few years, LLM-based chatbots have gained popularity for their ability to craft poetry and hold conversations. Some of these models have hundreds of billions of parameters and are so large that they can only be accessed online. But two recent trends have changed the landscape. First, organizations are now releasing 'open weights' versions of their LLMs, which users can download and run locally if they have the required computing power. Second, tech companies are creating smaller, scaled-down LLMs that run on consumer hardware yet perform on a par with older, larger models.

Researchers are adopting these tools for various reasons—cost savings, data confidentiality, or ensuring reproducibility. Thorpe, based in Oxford, UK, and working with the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK, is one of many researchers exploring these possibilities. He believes this trend will only grow. As computers become faster and models more efficient, more people will use AI on personal devices for most tasks. This shift will put powerful AI tools directly at scientists’ fingertips, offering access to the actual algorithms rather than just remote interfaces.

Big things in small packages

Several major tech companies and research institutions have launched small and open-weight models in recent years, among them Google DeepMind in London, Meta in Menlo Park, California, and the Allen Institute for Artificial Intelligence in Seattle, Washington. Although these models are described as "small", they can still contain some 30 billion parameters, which is tiny next to the largest online models but far from trivial to run. While OpenAI hasn't released open-weight versions of its GPT models, Microsoft has been especially active in this space.

In 2023, Microsoft released the small language models Phi-1, Phi-1.5 and Phi-2, followed in 2024 by four versions of Phi-3 and three of Phi-3.5. The Phi-3 and Phi-3.5 models range from 3.8 billion to 14 billion parameters, and the Phi-3-vision and Phi-3.5-vision variants also handle images. By some benchmarks, even the smallest Phi model surpasses OpenAI's GPT-3.5 Turbo, rumoured to have 20 billion parameters.

Enhanced Performance and Practical Applications

Sébastien Bubeck, Microsoft's vice president for generative AI, attributes the strong performance of the Phi-3 models to their training data. Typically, LLMs are trained to predict the next "token" (text unit) in a sequence. Predicting, say, the killer's name at the end of a murder mystery requires "understanding" the entire story, but tokens that demand that kind of reasoning are rare in ordinary text. To get around this, Microsoft used LLMs to generate millions of short stories and textbooks in which each element builds on the previous one. According to Bubeck, training on this richer data set yields a model that can fit on a mobile phone while delivering the same power as the initial 2022 version of ChatGPT.

He emphasizes that a well-crafted data set with a high concentration of reasoning tokens significantly enhances the model’s capabilities. Additionally, Phi-3 excels at routing tasks—determining whether a query should be handled by a larger model. Bubeck also notes that small models like Phi-3 are useful for scientists in areas with limited cloud access. He mentions that while hiking in the Pacific Northwest, where network coverage can be sparse, he can use a small model to identify flowers via a simple image query.

Researchers can further customize these tools for specific purposes. For instance, Alibaba has developed a series of models called Qwen, which range from 500 million to 72 billion parameters. A biomedical scientist in New Hampshire, known only as Kal'tsit on the Discord messaging platform, fine-tuned the largest Qwen model on scientific data to create Turbcat-72b, now available on the model-sharing site Hugging Face. She uses it for tasks such as brainstorming, manuscript proofreading, code prototyping and summarizing research papers, and the model has been downloaded thousands of times.


Preserving Privacy

Kal’tsit emphasizes that beyond fine-tuning open models for specialized tasks, local models offer a significant advantage: privacy. Sending personally identifiable information to commercial services can pose risks, especially with data-protection regulations. She explains, “If an audit happens and you’re found using ChatGPT, it could lead to serious issues.”

Cyril Zakka, a physician leading the health team at Hugging Face, also relies on local models for generating training data for other models. In one project, he uses them to extract diagnoses from medical reports so that another model can learn to predict those diagnoses from echocardiograms, which are commonly used for monitoring heart disease. In another, he generates questions and answers from medical textbooks to test other models. "We are moving towards fully autonomous surgery," Zakka explains; he envisions surgical robots trained to answer questions and communicate clearly with doctors.

Cost-Effective and Flexible Local Models

Zakka prefers using local models like Mistral 7B, from the Paris-based Mistral AI, or Meta’s Llama-3 70B. They cost less than subscription services like ChatGPT Plus and offer flexibility for fine-tuning.

Privacy, however, is a critical factor. Patient medical records can’t be sent to commercial AI platforms. Similarly, Johnson Thomas, an endocrinologist at Mercy Health in Springfield, Missouri, values privacy. He notes that clinicians often don’t have time to transcribe and summarize patient interviews. However, many commercial AI services are either too expensive or not approved for handling private medical data. To address this, Thomas is developing an alternative. Using Whisper, an open-weight speech-recognition model from OpenAI, and Gemma 2 from Google DeepMind, his system will allow physicians to transcribe patient conversations into medical notes and summarize research data.
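The two-stage shape of such a pipeline can be sketched as follows. The model names are those from the article, but the function name and prompt wording below are purely illustrative, and the transcription step (via the open-source `whisper` package) is shown only as a commented stub:

```python
def note_prompt(transcript):
    """Wrap a visit transcript in an instruction for a local LLM (e.g. Gemma 2).

    The wording is a hypothetical example, not Thomas's actual prompt.
    """
    return (
        "Summarize the following doctor-patient conversation as a concise "
        "clinical note. Do not invent findings.\n\n" + transcript
    )

# Stage 1 (speech to text), roughly, with OpenAI's open-weight Whisper model:
#   import whisper
#   transcript = whisper.load_model("base").transcribe("visit.wav")["text"]
# Stage 2: feed note_prompt(transcript) to a locally running Gemma 2 model,
# so the audio and the notes never leave the clinician's machine.
```

Because both stages run locally, no patient data has to touch a commercial API.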

Privacy concerns extend to industry as well. Portrai, a South Korean pharmaceutical company based in Seoul, developed CELLama, a tool that utilizes local LLMs like Llama 3.1. It processes information about a cell’s gene expression and other characteristics into a concise summary, which is then converted into a numerical representation. This data helps in clustering cells into types. On their GitHub page, the developers emphasize privacy as a key feature, noting that CELLama “operates locally, ensuring no data leaks.”
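The sentence-building step can be sketched in a few lines. Everything below is a simplified, hypothetical illustration: the gene names are examples, and CELLama's real template and preprocessing differ. The idea is only that a cell's top expressed genes become a short sentence, which a local embedding model then turns into a vector:

```python
def cell_sentence(top_genes, max_genes=5):
    """Summarize a cell as a short sentence naming its most expressed genes.

    A toy stand-in for CELLama's sentence-building step; the real tool
    uses its own template and feature selection.
    """
    genes = list(top_genes)[:max_genes]
    return "This cell highly expresses " + ", ".join(genes) + "."

# Cells with similar sentences get similar embeddings and cluster together.
print(cell_sentence(["CD3D", "CD3E", "IL7R", "CCL5"]))
# → This cell highly expresses CD3D, CD3E, IL7R, CCL5.
```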


Putting Models to Good Use

As the landscape of LLMs evolves, scientists have a rapidly growing set of options. Thorpe is still in the experimental phase of using LLMs locally: he initially tried ChatGPT but found it too expensive and wasn't satisfied with its tone. He now relies on local Llama models, in 8-billion- and 70-billion-parameter versions, both of which run on his Mac laptop.

Thorpe points out another key advantage of local models: they remain consistent. Commercial models can update at any time, leading to changes in output and forcing him to adjust his prompts or templates. “In science, reproducibility is crucial,” he says. “You always worry when you’re not in control of the reproducibility of what you’re generating.”

In one of his projects, Thorpe is developing code to align MHC molecules based on their 3D structures. To test his algorithms, he needs a large and diverse set of proteins—more than nature provides. For this, he turns to ProtGPT2, an open-weights model with 738 million parameters trained on approximately 50 million sequences, to help design plausible new proteins.
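A generative protein model produces raw text that still needs screening before use. The sketch below assumes ProtGPT2 is loaded through the Hugging Face `transformers` library (the commented model identifier is believed to be its Hugging Face name, but should be verified); the runnable part is a simple filter that keeps only candidates built from the 20 standard amino-acid letters:

```python
# Candidates from a generative model such as ProtGPT2 can contain tokenizer
# artefacts (newlines, padding), so a basic sanity filter is useful.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

def is_plausible_sequence(seq, min_len=30):
    """True if seq is reasonably long and uses only standard residues."""
    seq = seq.strip().replace("\n", "")
    return len(seq) >= min_len and set(seq) <= STANDARD_AA

# With the transformers library, generation would look roughly like:
#   from transformers import pipeline
#   gen = pipeline("text-generation", model="nferruz/ProtGPT2")
#   raw = [o["generated_text"] for o in gen("M", max_length=120,
#                                           num_return_sequences=20)]
#   candidates = [s for s in raw if is_plausible_sequence(s)]
```

The length cutoff and the filter itself are assumptions for illustration; real screening would also consider predicted structure, not just the alphabet.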

However, Thorpe acknowledges that local models aren’t always the best solution. When it comes to coding, he uses GitHub Copilot, a cloud-based tool, as his go-to assistant. “It feels like I’ve lost a vital tool when Copilot isn’t available,” he admits. While there are local coding tools like Google DeepMind’s CodeGemma or one developed by Continue in California, Thorpe believes they still can’t match the performance of Copilot.

Access Points

Running a local LLM is straightforward with the right tools. Ollama, available for Mac, Windows and Linux, lets users download open models such as Llama 3.1, Phi-3, Mistral and Gemma 2 and query them from the command line. Alternatives include the cross-platform app GPT4All and Llamafile, which can package a model and its runtime into a single file that runs on any of six operating systems, with or without a graphics processing unit.
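Once a model has been pulled (for example with `ollama pull llama3.1`), it can be queried from scripts as well as from the command line. A minimal sketch using only the Python standard library, assuming an Ollama server is running on its default port (11434); the payload builder is separated out so it can be reused:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt):
    """Build the JSON body for Ollama's /api/generate endpoint.

    stream=False asks the server for one complete response
    rather than a stream of partial tokens.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt):
    """Send a prompt to the local Ollama server and return its reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server and a pulled model):
# print(ask("llama3.1", "Summarize MHC class I presentation in one sentence."))
```

Because the request never leaves localhost, this works offline and keeps the prompt on the user's own machine.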

Sharon Machlis, a former editor at InfoWorld and a resident of Framingham, Massachusetts, has written a comprehensive guide on using local LLMs. Her guide covers a dozen options. “My first suggestion,” she says, “is to choose software that matches your comfort level with tinkering.” Some users may prefer the simplicity of apps, while others might enjoy the flexibility of the command line.

Stephen Hood, who leads open-source AI at Mozilla in San Francisco, believes local LLMs will soon be suitable for most applications. “The progress in the past year has been astounding,” he notes.

Ultimately, the choice of applications is up to the users. “Don’t be afraid to dive in,” advises Cyril Zakka. “You might be pleasantly surprised by the results.”
