Chatbots have quickly become part of everyday life, especially since ChatGPT’s release in November 2022, which popularised Large Language Models (LLMs). According to web statistics, ChatGPT – one of the many chatbots available – saw 1.8 billion total visits in April alone. Chatbots are now so ubiquitous that they are used professionally across a range of industries. Teachers use them to create lesson plans, programmers to write code and realtors to write property listings.
There has even been debate over whether chatbots could replace search engines altogether. A March 2023 survey found 42% of professionals envisage predominantly using AI chatbots for online queries in future.
Chatbots and online reputation
In terms of online reputation, chatbots can present a big issue. Hallucinations, which occur when a chatbot generates a false or fabricated response because it has misinterpreted the data or the data does not exist, can present major reputational concerns.
That’s why it is important for everyone – but high-profile individuals and companies in particular – to understand which sources chatbots use and how they prioritise the information from these sources when presenting responses to queries about them.
To get to the bottom of this issue, we looked into the types of sources chatbots use and where these sources rank on Search Engine Results Pages (SERPs).
How we did our research
For this piece of research, we used three well-known chatbots: Gemini, ChatGPT and Perplexity.
Our aim was to find out whether there is a correlation between the sources chatbots use and SERP rankings, and establish which source types are the most influential in chatbot responses.
To do this, we entered a query into each chatbot and checked where every source it cited for its claims ranked within the first 20 pages of search engine results for the same query. For reference, Gemini uses Google as the basis for its research, while ChatGPT and Perplexity use Bing.
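The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tooling used for the research: the function names, the URL normalisation and the assumption of ten results per SERP page are all hypothetical, and `ranked_urls` stands in for an ordered list of result URLs collected for a query.

```python
from typing import Optional
from urllib.parse import urlsplit

RESULTS_PER_PAGE = 10  # assumption: ten organic results per SERP page
MAX_PAGES = 20         # the research looked at the first 20 pages


def normalise(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal.

    Lowercasing the whole URL is a simplification for illustration.
    """
    parts = urlsplit(url.strip().lower())
    host = parts.netloc.removeprefix("www.")
    return host + parts.path.rstrip("/")


def serp_page(cited_url: str, ranked_urls: list[str]) -> Optional[int]:
    """Return the 1-based SERP page on which cited_url appears,
    or None if it is not within the first MAX_PAGES pages."""
    target = normalise(cited_url)
    limit = RESULTS_PER_PAGE * MAX_PAGES
    for position, url in enumerate(ranked_urls[:limit]):
        if normalise(url) == target:
            return position // RESULTS_PER_PAGE + 1
    return None


def bucket(page: Optional[int]) -> str:
    """Group a page number into the locations discussed in the results."""
    if page is None:
        return "Outside first 20 pages"
    return "Page 1" if page == 1 else "Pages 2-20"


# Hypothetical data: 30 ranked results for one query.
ranked = [f"https://site{i}.example/" for i in range(1, 31)]
print(bucket(serp_page("https://www.site1.example", ranked)))
print(bucket(serp_page("https://site25.example/", ranked)))
print(bucket(serp_page("https://unranked.example/", ranked)))
```

Matching on a normalised host and path rather than the raw string avoids counting a source as unranked because of a trailing slash or a "www." prefix.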
We also categorised each chatbot source by type, including owned assets, Wikipedia, mainstream news and alternative media.
For our queries, we examined entities in the following categories: “Fortune 500 CEOs”, “High-profile families”, “Fortune 500 companies” and “Private US companies”. Across these queries on Gemini, ChatGPT and Perplexity, we analysed 1,544 different claims made by the chatbots.
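As a quick arithmetic check, the per-category claim counts reported in the results section of this article (376, 386, 321 and 461) add up to the 1,544 total. The short Python sketch below confirms this and derives each category's share of all claims; the variable names are illustrative only.

```python
# Claim counts per query category, as reported in the results section.
claims_by_category = {
    "Fortune 500 CEOs": 376,
    "High-profile families": 386,
    "Fortune 500 companies": 321,
    "Private US companies": 461,
}

total_claims = sum(claims_by_category.values())
print(total_claims)  # 1544

# Share of all analysed claims contributed by each category.
for category, count in claims_by_category.items():
    share = 100 * count / total_claims
    print(f"{category}: {count} claims ({share:.1f}% of all claims)")
```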
The results
For our queries on Fortune 500 CEOs, the chatbots returned 376 claims. The majority (54%) of sources cited are located on Page 1 of SERPs, 30 percentage points ahead of the next closest result. This is especially pronounced for Perplexity, with 61% of all its sources located on Page 1.
In terms of type of source, owned assets proved the most prevalent, accounting for 43% of sources cited. Wikipedia is the second most common source at 21%.
For our queries on high-profile families, the three chatbots returned 386 claims. The majority of sources (58%) are located on Page 1, 36 percentage points ahead of the next closest result. In this instance, ChatGPT used the most sources on Page 1.
For our queries on high-profile families, Wikipedia is the most popular source, comprising 43% of all sources cited by the three chatbots. This proved to be the only category where mainstream news is the second most prevalent source, comprising 25% of all sources cited. The preference for Wikipedia and mainstream news on this topic is largely because many high-profile families have a reputation dominated by historical information.
For our queries on Fortune 500 companies, the three chatbots returned 321 claims. This set of queries is the only instance in which sources outside the first 20 pages of search results proved more prevalent than Page 1 sources, accounting for 42.1% of claims compared with 41.7%. This is largely driven by Gemini and ChatGPT which, in their responses to company queries, tended to focus on press releases from the past year reporting recent changes at the company. These press releases are often from smaller outlets or company newsrooms and do not rank strongly on SERPs. Perplexity, which tended to give profiles encompassing the company’s operations, continued to overwhelmingly favour Page 1 sources.
The three chatbots favoured owned assets as a source, using them in 55% of claims in total. This total is driven largely by ChatGPT and Gemini, which favoured owned assets as sources by margins of 55 and 43 percentage points respectively. These owned assets appear in chatbot responses as citations of company press releases and website information. ChatGPT, in particular, tended to focus its responses on recent news about the companies and regularly used company website information as a source.
For our queries on private US companies, the three chatbots returned 461 claims, and 66% of sources are located on Page 1, making it the most common location for sources cited. This is in large part driven by Perplexity, which favoured Page 1 sources by a greater margin than any chatbot did in any other set of queries. This is partly down to Perplexity heavily favouring Wikipedia or the home page of a company’s website as its source. These are almost always located on Page 1 for an entity. Both ChatGPT and Gemini also used Page 1 sources the most, albeit to a lesser extent.
Wikipedia proved the most popular domain sourced, cited in 45% of claims. However, having a collection of owned assets appears to be just as important. Both are among the top source categories people check and both can significantly influence an entity’s online profile. This reflects the findings for Fortune 500 companies, where the chatbots heavily favoured owned assets as well.
Overall, Page 1 is the most common location for sources cited by both ChatGPT and Perplexity, while Gemini tended to use sources outside the first 20 pages of Bing and Google. Perplexity has a much stronger first-page bias than ChatGPT, using Page 1 sources 66% of the time compared with 48% for ChatGPT.
Surprisingly, owned assets proved to be the most prevalent source, with Wikipedia being the single most used website, consulted only 28 fewer times than the variety of owned assets used. Alternative media proved the least common source type with blog posts occasionally being used.
The takeaways
From our research, we can now confirm that what shows up on Page 1 of a SERP is related to key chatbots’ answers to queries, particularly for Perplexity.
As a result, the appearance of positive and trustworthy assets on Page 1 may reduce the likelihood of chatbots drawing on nefarious sources.
However, maintaining a positive and accurate Page 1 should not be the only focus, as 27% of sources consulted for responses did not appear on the first 20 pages of Bing and Google.
Another big finding is the prevalence of owned assets as a source for chatbots, alongside Wikipedia. These two sources make up 72% of all sources used by chatbots. That means companies need a factually accurate Wikipedia page and detailed online assets describing the company and individuals if they want chatbots giving out accurate information about them.
Potential red flags
These findings raise some issues. Firstly, users may be dissuaded from using chatbots to research companies if they feel as though the response they receive is simply a corporate puff piece. Secondly, there’s potential for hostile actors to create content impersonating a company’s online assets in the hopes of confusing the chatbots into assuming it is the company’s page.
Users may also be dissuaded from using chatbots for research on entities due to their heavy reliance on Wikipedia as a non-owned asset source. Often, responses from chatbots are taken word for word from Wikipedia.
Other interesting observations
In four cases, chatbots cited documentaries or video clips as sources for their claims. For example, in response to a query about a high-profile family, Perplexity listed an episode of a BBC documentary on the family as one of its additional sources.
There were also cases where chatbots made accurate claims, but the information was not found within the source directly attributed to the claim. For example, in a query about a high-profile chairman, ChatGPT accurately described him as the chairman of the company, citing Wikipedia as the source. However, this information does not appear on Wikipedia.
We identified very few hallucinations in our research, and none of them dramatically altered the profile of the entity presented in the query. For example, in Perplexity’s response to a query about a Fortune 500 company, it stated a fact about the company. However, the source Perplexity provided did not include this information and other sources contradicted the fact.
© Digitalis Media Ltd. Privacy Policy.