6 months – that’s how far back you can access your Bing search data. A year from now, will you be able to analyse how Bing searches changed following the ChatGPT integration?

Bing’s recent spotlight following the ChatGPT integration should encourage enterprises to review their Bing data collection pipelines. Depending on the pace of innovation and the differentiators between ChatGPT and Bard, Google’s eye-watering 93% market share may soon be challenged. Moreover, GPT-4 may well extend Microsoft’s AI lead over Google.

Given the current state of the search engine market, businesses have understandably deprioritised Bing data pipelines. Although it is still early days for AI-enhanced search, Microsoft’s stake in OpenAI makes it likely that further AI developments and other value-adding services will be integrated into Bing. That could increase end-user interest in Bing, which in turn creates a need for enterprises to collect and analyse Bing data.

Even though the market is unlikely to shift considerably in the near future, businesses may see their search engine traffic become more distributed over the next few years. If Bing data pipelines are not set up, however, months or even years’ worth of valuable search data will be lost: today, Bing’s data retention period is six months. A simple data ownership pipeline put in place today may prove invaluable a few years down the road.

How to Get Ownership of Bing Search Data

Bing Webmaster Tools data can be accessed programmatically via the Bing Webmaster API, and Bingbot data can be collected from web server logs.

Bing’s Webmaster API contains a comprehensive set of well-documented endpoints for user search and search engine management. The API can be used to create simple pipelines which write search and index data into any storage service. This approach offers enterprises full control over the retention policies, meaning that they would no longer be dependent on Bing’s policies. Setting up the infrastructure to enable longer data retention policies while it is still early days will provide you with the long-term visibility needed for data analysis.
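As a rough illustration of such a pipeline, the Python sketch below pulls query statistics and writes them to a local SQLite table so that they outlive Bing’s retention window. It is a minimal sketch under stated assumptions, not a reference implementation: the endpoint base URL, the GetQueryStats method name, the "d" response wrapper and the field names (Query, Impressions, Clicks, Date) reflect our reading of the public API and should be checked against the current Bing Webmaster API documentation. The table name and helper function names are purely illustrative.

```python
import json
import re
import sqlite3
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Assumed JSON endpoint base; verify against the current API documentation.
API_BASE = "https://ssl.bing.com/webmaster/api.svc/json"


def build_url(method, site_url, api_key):
    """Build a request URL for one API method (parameter names are assumptions)."""
    query = urllib.parse.urlencode({"siteUrl": site_url, "apikey": api_key})
    return f"{API_BASE}/{method}?{query}"


def fetch_query_stats(site_url, api_key):
    """Call the API and return the list of rows (assumed to sit under the 'd' key)."""
    url = build_url("GetQueryStats", site_url, api_key)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response).get("d", [])


def parse_wcf_date(value):
    """Convert a '/Date(1672531200000)/' style timestamp to an ISO date string."""
    millis = int(re.search(r"-?\d+", value).group())
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).date().isoformat()


def store_rows(conn, site_url, rows):
    """Upsert rows into a local table and return the total row count."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bing_query_stats (
               site_url TEXT, date TEXT, query TEXT,
               impressions INTEGER, clicks INTEGER,
               PRIMARY KEY (site_url, date, query))"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO bing_query_stats VALUES (?, ?, ?, ?, ?)",
        [
            (site_url, parse_wcf_date(r["Date"]), r["Query"], r["Impressions"], r["Clicks"])
            for r in rows
        ],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM bing_query_stats").fetchone()[0]
```

Run on a schedule shorter than six months (daily, for instance), a job like this accumulates history in storage you control. Because the primary key covers site, date and query, re-running it over overlapping date ranges updates existing rows rather than duplicating them.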

Web server log processes for some enterprises are entirely designed for Google and Googlebot. They don’t factor in Bing or other search engines such as Yahoo!, Yandex, Baidu, or Naver. In those instances, enabling Bingbot data collection in the server log process may entail modifying the whole underlying server log process. These changes can take two possible forms:

  1. Inserting Bingbot into the existing server log system – Simply adding Bingbot alongside Googlebot may create issues for downstream analytics tools that expect only Googlebot data. The log entries will need to be tagged to distinguish Google-produced entries from Bing-produced ones.
  2. Creating a separate Bingbot pipeline – To leave the Googlebot server log system untouched, a separate Bingbot pipeline can be created. This approach requires additional upfront effort, but allows more control and flexibility to manage each search engine’s server log pipeline independently.
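Both options start from the same step: working out which crawler produced each log entry. The Python sketch below shows one way this might look, assuming logs in the common "combined" format where the user agent is the last quoted field. The function names and the pattern table are hypothetical. Matching on the user agent string alone is a simplification, since user agents can be spoofed, so a production pipeline would normally also verify the requesting IP address.

```python
import re

# Substrings assumed to identify each crawler's user agent (illustrative only).
CRAWLER_PATTERNS = {
    "googlebot": re.compile(r"googlebot", re.IGNORECASE),
    "bingbot": re.compile(r"bingbot", re.IGNORECASE),
}

# In the combined log format the user agent is the last double-quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def tag_crawler(log_line):
    """Return the crawler name for a log line, or None if no known crawler matches."""
    match = USER_AGENT_RE.search(log_line)
    if not match:
        return None
    user_agent = match.group(1)
    for name, pattern in CRAWLER_PATTERNS.items():
        if pattern.search(user_agent):
            return name
    return None


def split_by_crawler(lines):
    """Group log lines by crawler so each group can feed its own pipeline."""
    buckets = {}
    for line in lines:
        crawler = tag_crawler(line)
        if crawler is not None:
            buckets.setdefault(crawler, []).append(line)
    return buckets
```

For the first option, the value returned by tag_crawler could be written as an extra field on each entry so that downstream tools can filter on it. For the second, split_by_crawler routes Bingbot entries to their own destination and leaves the Googlebot flow unchanged.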

Early 2023 may prove to be an inflexion point in the search market. Being able to monitor changes from this point forward can bring numerous long-term advantages. However, this all depends on having data collection pipelines in place to take ownership of your websites’ search data. To find out more about how Merj can help you set up your Bingbot data pipelines, contact us today.