Andrea Volpini, CEO of WordLift, is a visionary entrepreneur, now focusing on the semantic web and artificial intelligence. Co-founder of Insideout10 and Redlink, Andrea has 25 years of world-class experience in online strategies and web publishing. Nowadays, he provides tools and consulting services to global brands that need to grow online. To learn more about Andrea and WordLift updates, follow his Twitter account.
1. How many years of experience in digital marketing do you have?
I have been working on the web in various roles since 1995. It’s over a quarter of a century, but believe me, I still get excited about new technologies and how people interact online.
2. What projects have you launched recently that you are proud of?
I am co-founder and CEO of WordLift. We build and host knowledge graphs to automate SEO.
3. In your opinion, will artificial intelligence solve SEO?
AI is already addressing many issues in SEO; the point with today’s narrow AI is how we envision human-AI interaction. Most systems are intermittent by design. For example, I analyze the queries on SEMRush, the system extracts the type of intent by classifying the queries and sends me back the result, and the flow is complete. It works but only helps in some scenarios. SEO is a continuous process, so our focus is on building ongoing human-AI interactions where data is stored in a knowledge graph. As we iterate and improve specific pages, the AI improves and gives us back better results. Practically speaking, the SEO areas where we’re mostly focused, in terms of AI automation, are:
- Scouting of new search opportunities: by building a knowledge graph from the SERP behind a query, we can detect the tendencies in terms of entities we expect to mention in our content. We are working to refine how entities in these graphs are extracted and clustered.
- Structured data automation: WordLift’s NLP extracts entities from web pages to refine the markup. This generally means more scalable and more accurate structured data markup throughout the site. Marked-up entities are also interlinked automatically with DBpedia and Wikidata.
- Natural Language Generation (NLG) for content generation: we have primarily focused on small text snippets such as product descriptions, introductory text for category pages, and FAQs.
- Content Recommendation: we constantly enrich web pages by adding links, context cards, and content widgets to improve the user experience. The more content we produce for SEO, the more critical it becomes to help the user find it. We’re also actively working on multimodal and semantic search experiences in this area.
4. What are knowledge graphs in the context of the Semantic Web?
In the Semantic Web context, knowledge graphs are described as RDF graphs made of RDF triples. Every triple consists of a subject, a predicate, and an object. A knowledge graph is designed for integrating information and typically focuses on instances rather than on the schema of the represented knowledge. In SEO, we primarily use schema.org as the core vocabulary and focus on creating a content knowledge graph that describes the content we publish online to search engines and intelligent agents like chatbots and content recommendation systems.
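To make the subject-predicate-object idea concrete, here is a minimal sketch in plain Python. It is not WordLift's implementation and does not use an RDF library; the `example.org` identifiers, the `Triple` type, and the `match` helper are assumptions made up for illustration. Only the schema.org terms and the RDF `type` predicate are real vocabulary.

```python
from typing import List, NamedTuple, Optional

SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, used only for this example.
ARTICLE = "https://example.org/articles/what-is-linked-data"
AUTHOR = "https://example.org/people/jane-doe"

# A tiny content knowledge graph: every fact is one triple.
graph = [
    Triple(ARTICLE, RDF_TYPE, SCHEMA + "Article"),
    Triple(ARTICLE, SCHEMA + "headline", "What is linked data?"),
    Triple(ARTICLE, SCHEMA + "author", AUTHOR),
    Triple(AUTHOR, RDF_TYPE, SCHEMA + "Person"),
    Triple(AUTHOR, SCHEMA + "name", "Jane Doe"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Follow two edges: article -> author -> name.
author_id = match(graph, ARTICLE, SCHEMA + "author")[0].obj
author_name = match(graph, author_id, SCHEMA + "name")[0].obj
print(author_name)
```

The pattern-matching helper is a simplified stand-in for what a SPARQL query does against a real triple store.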
5. Is it better to build your own knowledge graph before Google builds it for you or vice versa?
The more we make information accessible to others, the more we can achieve our marketing goals. Google’s KG is constantly looking for corroborating information. Publishing data as structured linked data facilitates this process. The real question we need to ask ourselves, though, is another one. Are we doing all of this for Google only? The reality is that today’s organizations are heavily investing in automation. Whether it is about finding new untapped search opportunities, automating content creation using today’s large language models (NLG), or implementing conversational AI, we all need a graph-based structure that describes the content that we produce. A KG is a foundational building block for your AI strategy. It is no longer just for Google, Bing, or Amazon. A KG is for every organization looking at growth in a strategic way.
6. What is the role of Natural Language Processing (NLP) in creating knowledge graphs?
A knowledge graph is a data structure that we use to model information using nodes (entities) and typed edges (relationships, that is, properties that link entities to each other). We integrate and share data by defining a topology of entities. The most elementary unit of a knowledge graph is a triple made of Subject-Predicate-Object. Each triple establishes the connection between two entities in the graph.
We use NLP to extract information from unstructured content like web pages, emails, or any form of textual content. Here at WordLift, we developed our NLP stack to detect named entities in over 100 different languages. In layman’s terms, we can say that a named entity represents the proper name of any object. Andrea Volpini is the name of a Person; Google is the name of an Organization. When analyzing a sentence, the NLP will first recognize the entity’s presence in the text by extracting the label. Then it will disambiguate this term to correctly identify the right named entity based on the context of the sentence (is jaguar the car company or the large cat?).
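The two steps described above (recognize the label, then disambiguate it from context) can be illustrated with a toy sketch. This is a deliberately naive word-overlap heuristic, not WordLift's NLP stack; the `CANDIDATES` dictionary, the `ex:` identifiers, and the context word lists are all hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical mini knowledge base: each label maps to candidate entities,
# each with context words that hint at that meaning.
CANDIDATES = {
    "jaguar": [
        {
            "id": "ex:Jaguar_Cars",
            "type": "Organization",
            "context": {"car", "cars", "engine", "drive", "dealer", "luxury"},
        },
        {
            "id": "ex:Jaguar_animal",
            "type": "Animal",
            "context": {"cat", "jungle", "prey", "habitat", "species", "rainforest"},
        },
    ],
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def link_entities(sentence: str) -> List[Tuple[str, str, str]]:
    """Return (label, entity id, entity type) for each known label found."""
    tokens = tokenize(sentence)
    words = set(tokens)
    linked = []
    for token in tokens:
        candidates = CANDIDATES.get(token)
        if not candidates:
            continue  # step 1: the label is not a known entity name
        # Step 2: pick the candidate sharing the most context words with
        # the sentence. On a tie, the first candidate listed wins.
        best = max(candidates, key=lambda c: len(c["context"] & words))
        linked.append((token, best["id"], best["type"]))
    return linked


print(link_entities("The jaguar is a large cat that hunts its prey in the rainforest."))
print(link_entities("I test drove the new Jaguar at a luxury car dealer."))
```

Production systems replace the hand-written context sets with statistical models, but the recognize-then-disambiguate flow is the same.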
7. How to choose the right focus keyword?
In general, I always find it good practice to analyze the searcher personas at the very beginning. Personas are examples of typical customers that you are targeting. We can identify a core intent (the primary information need of the searcher personas) and analyze the various queries that will gravitate around this intent.
An intent is a broad concept, but we can practically see the intent by looking at what results Google provides for each query. Some queries are more transactional (“I want a pair of sunglasses for skiing”), some are more informational (“What is linked data?”). Some will work for a persona, and some will not. We typically analyze intents by classifying them using unsupervised machine learning techniques. I can look, for example, at a query like “What is linked data?” and evaluate if this will resonate best with SEOs, developers, or content marketers.
Another critical tactic here is to create a micro knowledge graph by analyzing with NLP the top results on Google behind the query. For instance, within the first result (rankGroups = 1), one of the entities being detected is SPARQL. On the one hand, this indicates that the text we want to write to rank for “What is linked data?” should mention SPARQL. On the other hand, using this API, we can also check:
- whether we already have this entity in our KG (and therefore in the content of our website);
- alternatively, whether we have semantically similar concepts elsewhere that we could use for creating internal links.
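The two checks above can be sketched as follows. This is only an illustration under stated assumptions: the `serp_results` data, the `our_kg` set, and the hand-written `related_in_kg` map are made up, and the sketch does not call the API mentioned in the interview. In practice, semantic similarity would come from a model rather than a lookup table.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical output of an NLP pass over the top results for
# "What is linked data?". Field names are assumptions.
serp_results = [
    {"rank_group": 1, "entities": ["Linked data", "SPARQL", "RDF", "Semantic Web"]},
    {"rank_group": 2, "entities": ["Linked data", "RDF", "Tim Berners-Lee"]},
    {"rank_group": 3, "entities": ["Linked data", "SPARQL", "URI"]},
]

# Hypothetical: entities we already cover, and related concepts for gaps.
our_kg: Set[str] = {"Linked data", "RDF", "Semantic Web", "Knowledge graph"}
related_in_kg: Dict[str, List[str]] = {"SPARQL": ["RDF"], "URI": ["Linked data"]}


def entity_gaps(results, kg, related) -> List[dict]:
    """For each entity in the SERP, report coverage and link candidates."""
    counts = Counter(e for r in results for e in r["entities"])
    report = []
    for entity, n in counts.most_common():
        in_kg = entity in kg
        # Only suggest internal links for entities we do not cover yet.
        links = [] if in_kg else [c for c in related.get(entity, []) if c in kg]
        report.append(
            {
                "entity": entity,
                "results_mentioning": n,
                "in_kg": in_kg,
                "internal_link_candidates": links,
            }
        )
    return report


report = entity_gaps(serp_results, our_kg, related_in_kg)
for row in report:
    print(row)
```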
Choosing a target keyword is also about leveraging our existing content.
8. What do you think about multimodal search?
In the context of AI research, we’re witnessing the evolution from single-modality models to web-scale multimodal models that are trained on different input modalities (i.e., text, audio, images, and video). I find it truly exciting; our brain also derives meaning from multiple senses (touch, sight, hearing, smell, and taste). This evolution will make it easier to work on SEO by combining the various modalities. This is the reason we’re happy to partner with Jina AI to bring true multimodal capabilities to our clients in the context of e-commerce, where we see the highest potential.
I love to work on the two sides of the coin. On one side, we have to learn how to optimize the content to rank on Google Lens, and, on the other side, we want to re-use this same data to improve on-site search with multimodality.
9. What are your tips on how to optimize content for Google Discover?
Discover’s traffic has changed dramatically over the years. It is an essential stream of traffic in various verticals, and to optimize content, I have worked on a checklist for Google Discover that I keep updating. There are, though, two additional insights worth sharing:
- Web Stories are working exceptionally well on Discover.
- Discover is working on content that is more similar to the content we find on social media than the content that ranks on traditional search.
I recently started to cluster featured images from top-ranking blog posts on Google Discover. By doing so, I could see that influencers and social-media-like content that didn’t perform well on traditional search have skyrocketed on Discover. My advice is to think in these exact terms. Discover is Google’s new attempt to create traffic that can compete with social networks. It also means that the expectations in terms of conversion are different from organic search. People will absorb content with a different attitude, more similar to scrolling stories on Instagram than running a search engine query.
10. How to make content accessible to voice search and virtual assistants? Could you please name the main steps?
A good starting point for voice search is to build Actions for the Google Assistant using Web Content. FAQs, HowTo guides, Recipes, News, Videos, and Podcasts can immediately appear in voice search as long as they are properly annotated with structured data. This is really the entry point.
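As an illustration of what "properly annotated with structured data" can look like for FAQs, here is a minimal sketch that builds schema.org `FAQPage` markup as JSON-LD. The `faq_jsonld` helper and the sample question are assumptions for the example; the `FAQPage`, `Question`, `Answer`, `mainEntity`, and `acceptedAnswer` terms are real schema.org vocabulary. Valid markup is a prerequisite, not a guarantee, of appearing in any search or voice feature.

```python
import json
from typing import Iterable, Tuple


def faq_jsonld(pairs: Iterable[Tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical content, for illustration only.
doc = faq_jsonld(
    [
        (
            "What is linked data?",
            "Linked data is structured data published on the web and "
            "interlinked with other data using URIs.",
        ),
    ]
)

# The JSON string is what would be embedded in the page inside a
# <script type="application/ld+json"> element.
markup = json.dumps(doc, indent=2)
print(markup)
```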
From there, we can start creating our own conversational AI by re-using the same data and the relationships inside a knowledge graph. Chatbots greatly benefit from semantically rich data stored in a graph database.
Knowledge graphs represent the evolution of so-called symbolic AI, where data is modeled using rules, vocabularies, and ontologies. A chatbot that uses a knowledge base scales more quickly across various intents without massive training data. In today’s systems, we combine deep learning and symbolic AI to let the user converse with the content on our website while minimizing the cost of content management. In other words, we want to re-use, as much as possible, the optimization work that we do for SEO as part of the workflow for managing conversational user interfaces.
11. SEO is such a controversial thing. Every SEO specialist rates its success differently. What is your way to rate SEO success? What metrics do you look at?
We need to look at the business we’re serving. I start by understanding the business model and the searcher personas. ROI is crucial in SEO. We need to know how the company runs and, where possible, calculate the ROAS for every initiative. Everything where ROAS is >= 3 is a success; everything below is a cost, but we can still learn from it. Sometimes it is enough to evaluate clicks from the Google Search Console; other times, you’ll need to look at the conversion rate and the average ticket value, but it’s always critical to turn strategies into measurable business value. Looking at the business value will also help you establish common ground with the client. We speak the same language, and we want to evaluate results with the same metric.
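The ROAS rule of thumb above can be written down as a small calculation. This sketch assumes the usual definition, ROAS = attributed revenue / cost of the initiative, and estimates revenue from clicks, conversion rate, and average ticket value. The function names and all figures are hypothetical.

```python
def estimated_revenue(clicks: int, conversion_rate: float, average_ticket: float) -> float:
    """Revenue estimate: clicks x conversion rate x average ticket value."""
    return clicks * conversion_rate * average_ticket


def roas(revenue: float, cost: float) -> float:
    """Return on spend: revenue attributed to the initiative over its cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return revenue / cost


def is_success(revenue: float, cost: float, threshold: float = 3.0) -> bool:
    """Apply the ROAS >= 3 rule of thumb from the interview."""
    return roas(revenue, cost) >= threshold


# Hypothetical figures: 12,000 organic clicks, 2% conversion rate,
# 75.00 average ticket, and 5,000.00 spent on the initiative.
revenue = estimated_revenue(clicks=12_000, conversion_rate=0.02, average_ticket=75.0)
cost = 5_000.0

print(f"ROAS: {roas(revenue, cost):.2f}")
print("success" if is_success(revenue, cost) else "cost we can learn from")
```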
12. Is link building important for increasing the website’s positions nowadays?
It can be, though I am neither an expert nor a strong advocate. I would recommend focusing on Semantic SEO, structured data, topical authority, and internal links.
13. What are the top trends for SEO in 2022?
I have identified the following trends for SEO in 2022. They are centered on creating scalable human-machine publishing workflows:
- Multimodality and Multilinguality: From a Web Index to Pre-Trained Models
- Optimised On-site Search
- Phygital and Local SEO
- Conversational User Interfaces
- Climate-neutral websites
14. Does your university degree help you to succeed in your career?
I don’t have one 🙊. I dropped out of university, where I was studying for a bachelor’s in mass communication science; the Web was about to disrupt every industry, and I love to learn by doing. Having said that, I strongly believe in the value of traditional learning and academic research. Thanks to the research work that the team at the Semantic Technology Laboratory (STLab) of the CNR was doing back in 2013, we began to work on a semantic editor that later became WordLift. Even today, we’re proud to actively work on research programs with universities and research institutions to bring our technology forward.
15. Are you going to surprise the world with something new (tool/app, course, product)?
We are about to launch a new add-on for Google Sheets to help clients find new search opportunities and create knowledge graphs from search queries. We want to understand the search intent behind a query by analyzing top results with named entity recognition and linked data. At the same time, we want to provide an immediate solution for the cold start problem: what entities should I have in my knowledge graph? The add-on will help us create the first entities to build traffic around specific queries. We are very excited about it, and we plan to go live in the coming weeks 🤩.
16. How do you see the future of SEO (in 5 years)?
SEO is becoming a new frontier in an AI-driven society. SEO is the connecting dot between human knowledge and machines. I believe we all share the responsibility of creating an AI that follows well-defined ethical guidelines and respects fundamental human values (individual rights, privacy protection, non-discrimination, fairness, and non-manipulation).