Google has unveiled a research paper detailing an advanced algorithm that extracts “services offered” from local business websites. This information is then added to business profiles on Google Maps and Search, enhancing visibility and relevance.
The system aims to reduce the effort users spend looking for local services. A notable contributor to the research is Marc Najork, a distinguished scientist at Google known for his work in information retrieval and artificial intelligence. The paper is dated 2023; according to the Internet Archive, it was published in 2024.
The research paper explains:
“…to reduce user effort, we developed and deployed a pipeline to automatically extract the job types from business websites. For example, if a web page owned by a plumbing business states: ‘we provide toilet installation and faucet repair service’, our pipeline outputs toilet installation and faucet repair as the job types for this business.”
BERT Powers the System
Google employs the BERT language model to classify phrases from business websites, determining their relevance as job types. BERT is fine-tuned with examples and additional context, such as website structure and business category, to enhance accuracy.
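To make the idea of "phrase plus additional context" concrete, here is a minimal sketch of how a candidate phrase might be packaged for a BERT-style sentence-pair classifier. The `Candidate` fields and the `build_classifier_input` layout are hypothetical illustrations; the article does not describe the exact input format Google uses.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one candidate job type (not from the paper)."""
    phrase: str             # candidate job type phrase found on the page
    context: str            # sentence(s) surrounding the phrase
    business_category: str  # category taken from the business profile


def build_classifier_input(c: Candidate) -> tuple[str, str]:
    """Return a (segment_a, segment_b) text pair for a sentence-pair classifier.

    Assumption: the category and phrase go in the first segment and the
    surrounding text in the second. A real system may arrange them differently.
    """
    segment_a = f"{c.business_category}: {c.phrase}"
    segment_b = c.context
    return segment_a, segment_b


example = Candidate(
    phrase="faucet repair",
    context="we provide toilet installation and faucet repair service",
    business_category="plumber",
)
pair = build_classifier_input(example)
```

A fine-tuned model would then score each pair as "is a job type offered by this business" or not; the tokenization and model call are left out here because they depend on the chosen library.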
Building the Local Search System
The system’s development began with creating training data from billions of home pages listed in Google business profiles. Job type information was extracted from tables and lists on these pages. This data served as the foundation for expanding job type keyword phrases.
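As a rough illustration of pulling candidate job types out of tables and lists, the sketch below collects the text of list items and table cells using Python's standard `html.parser` module. The class and function names are made up for this example, and it assumes reasonably well-formed HTML; it is not the paper's extraction code.

```python
from html.parser import HTMLParser


class ListAndTableText(HTMLParser):
    """Collects the text inside <li>, <td> and <th> elements."""

    TARGET_TAGS = {"li", "td", "th"}

    def __init__(self):
        super().__init__()
        self.items = []    # extracted strings, in document order
        self._depth = 0    # how many target elements are currently open
        self._buffer = []  # text fragments of the outermost open element

    def handle_starttag(self, tag, attrs):
        if tag in self.TARGET_TAGS:
            if self._depth == 0:
                self._buffer = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.TARGET_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.items.append(text)

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_structured_items(html: str) -> list[str]:
    parser = ListAndTableText()
    parser.feed(html)
    parser.close()
    return parser.items


page = (
    "<ul><li>Toilet installation</li><li> Faucet  repair </li></ul>"
    "<p>Call us today</p>"
    "<table><tr><td>Drain cleaning</td></tr></table>"
)
items = extract_structured_items(page)
```

Text in ordinary paragraphs is ignored on purpose: the point of this step is that structured elements are a cheap source of likely job type phrases. Nested lists are merged into their outer item in this simplified version.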
Addressing Relevance Issues
Initially, the system faced challenges as many pages mentioned job types unrelated to actual services. By incorporating surrounding sentences, Google improved the system’s ability to discern the context of job type phrases.
The research paper explains:
“We found that many pages mention job type names for other purposes like giving life tips. For example, a web page that teaches readers to deal with bed bugs might contain a sentence like a solution is to call home cleaning services if you find bed bugs in your home. They usually provide services like bed bug control. Though this page mentions multiple job type names, the page is not provided by a home cleaning business.”
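The idea of looking at surrounding sentences can be sketched as a small context-window helper. The sentence splitter here is deliberately naive, and the function names and the `radius` parameter are assumptions made for illustration, not details from the paper.

```python
import re


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def context_window(text: str, phrase: str, radius: int = 1) -> list[str]:
    """For each sentence mentioning `phrase`, return that sentence joined
    with up to `radius` neighbouring sentences on each side."""
    sentences = split_sentences(text)
    windows = []
    for i, sentence in enumerate(sentences):
        if phrase.lower() in sentence.lower():
            start = max(0, i - radius)
            end = min(len(sentences), i + radius + 1)
            windows.append(" ".join(sentences[start:end]))
    return windows


text = (
    "A solution is to call home cleaning services if you find bed bugs in your home. "
    "They usually provide services like bed bug control. "
    "Stay calm."
)
windows = context_window(text, "bed bug control", radius=1)
```

With the neighbouring sentences attached, a classifier can see that "bed bug control" appears in advice about calling someone else, not in a description of the page owner's own services.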
SEO Insight
The algorithm classifies job type keyword phrases together with the text around them. The takeaway is that the sentences surrounding a service mention matter: they let the system judge what the phrase is doing on the page without processing the entire page.
Expanding Beyond Local Business
The system’s methodology can be applied to other domains, such as expertise finding, legal, and medical information extraction. This adaptability showcases its potential in various fields.
They write:
“The lessons we shared in developing the largescale extraction pipeline from scratch can generalize to other information extraction or machine learning tasks. They have direct applications to domain-specific extraction tasks, exemplified by expertise finding, legal and medical information extraction.
Three most important lessons are:
(1) utilizing the data properties such as structured content could alleviate the cold start problem of data annotation;
(2) formulating the task as a retrieval problem could help researchers and practitioners deal with a large dataset;
(3) the context information could improve the model quality without sacrificing its scalability.”
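One possible reading of lesson (2) is that candidates are first retrieved by cheap lookup against a known vocabulary of job types, so the expensive model only scores a short list per page. The sketch below shows that idea with n-gram matching; the vocabulary, the `max_len` parameter, and the function names are hypothetical, and the article does not say how the paper actually implements retrieval.

```python
import re


def tokenize(text: str) -> list[str]:
    # Lowercase and keep only alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_candidates(page_text: str, vocabulary: set[str], max_len: int = 4) -> list[str]:
    """Return vocabulary phrases that occur in the page as token n-grams,
    in order of first appearance and without duplicates."""
    tokens = tokenize(page_text)
    found = []
    seen = set()
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            if start + length > len(tokens):
                break
            ngram = " ".join(tokens[start:start + length])
            if ngram in vocabulary and ngram not in seen:
                seen.add(ngram)
                found.append(ngram)
    return found


# Hypothetical vocabulary; phrases are stored lowercase to match the tokenizer.
vocabulary = {"toilet installation", "faucet repair", "bed bug control"}
candidates = retrieve_candidates(
    "We provide toilet installation and faucet repair service.", vocabulary
)
```

Because each lookup is a set membership test, this step scales to very large page collections, and only the retrieved candidates need to be passed on for context-aware classification.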
Algorithm Success
The algorithm has proven successful, offering high precision and scalability. It has been operational for over a year, delivering accurate results for Google Search and Maps users.
The researchers write:
“Our pipeline is executed periodically to keep the extracted content up-to-date. It is currently deployed in production, and the output job types are surfaced to millions of Google Search and Maps users.”
Key Takeaways
- Google’s Job Type Algorithm: Extracts services from business websites for Google Maps and Search.
- Effective Content Extraction: Reads free-text content, effective even when services are buried in paragraphs.
- Context Matters: Evaluates surrounding words to confirm relevance, improving accuracy.
- Versatile Application: Can be applied to fields like legal or medical information extraction.
- Proven Precision: Operational for over a year, delivering scalable, high-precision results.
Google’s research showcases an algorithm that enhances local business listings by extracting service descriptions from websites. Because it can read services out of free-running text rather than only from structured HTML such as tables and lists, the method can be adapted for other industries that need information from unstructured text.
Read the research paper abstract and download the PDF version here: