On pre-bid bot detection and filtration - Are ad tech vendors serving US Government and Fortune 500 brands’ digital ads to bots?

Executive Summary

  1. During several analyses conducted on behalf of Fortune 500 advertisers, impression-level log file data and financial invoices suggested that advertisers were billed by ad tech vendors for ad impressions served to declared bots operating out of known data center server farms. Some of these bots are on the IAB Tech Lab Spiders and Bots List, and some of the data center IP addresses are on an industry reference list of known data center IPs with non-human (bot) traffic.

    1. Some of these brands reported that they were spending millions of dollars each year specifically for “bot avoidance” pre-bid segments and technology from various ad verification vendors. This technology is allegedly intended to prevent advertisers’ ads from being served to bots or to invalid or fraudulent traffic.

  2. Adalytics obtained access to three different web visitation datasets generated entirely by bots. The cumulative size of these internet crawling bot datasets is over a petabyte of web traffic data, spanning more than two million different websites crawled and more than seven years’ worth of crawling data. The data includes web crawling sessions from bots based in the USA, Canada, Latin America, Europe, Singapore, Australia, New Zealand, and Japan.

    1. All three of the bot datasets crawl the web for “benign” reasons, such as academic research or cybersecurity. None of these three bots is designed to commit “ad fraud” or to intentionally generate ad revenue or invalid traffic. The archiving bots run and access the Internet from data centers. Some of the bots announce themselves by declaring a known bot user agent. This enables websites, ad tech vendors, and verification vendors to know upfront who or what is requesting their services and where the traffic originates.

  3. Some ad tech vendors appear to be serving audience-targeted or “personalized” ads (including retargeting ads) to bots in data centers, even though these bots have no stateful attributes, cookies, session storage, fingerprints, user IDs, or browsing history.

  4. These bot datasets reveal that thousands of different brands - including Procter & Gamble, Hershey’s, HP, Wall Street Journal, T-Mobile, United States Postal Service, Pfizer, IBM, JPMorgan Chase, Bank of America, Microsoft, Haleon, Australian Defence Force, New York, Utah, Oregon, Florida, Indiana & Virginia state governments, Bayer Healthcare, MasterCard, Ernst & Young, Visa, Kenvue, US Bank, Unilever, Disney, American Express, Beam Suntory, Diageo, and United States government agencies such as the New York City Police (NYPD), Department of Homeland Security (DHS TSA), US Census, healthcare.gov, US Army, US Air Force, US Navy, Department of Veterans Affairs, and the Centers for Disease Control and Prevention - had their ads served by ad tech vendors to bots in data centers for at least five years (since at least 2020).

  5. Many of the ads served to bots appear to have used various “pre-bid” bot avoidance or filtration tools from vendors such as Integral Ad Science (IAS) and DoubleVerify. For example, the US Navy had ads served to declared bots running out of Google Cloud server farms, even though the US Navy’s ads contain source code references to “charge-allDoubleVerifyBotAvoidance”.

  6. In just the observed sample dataset, Google was observed serving thousands of healthcare.gov ads (from the U.S. Centers for Medicare & Medicaid Services) for years to bots running out of data center server farms.

  7. Many of the ad tech platforms which were observed serving ads to declared bots in data center server farms have publicly declared that they filter bot traffic pre-bid through partnerships with HUMAN Security (f/k/a “White Ops”). For example, Google, Trade Desk, Pubmatic, Magnite, Index Exchange, Triplelift, OpenX, Microsoft Xandr, Kargo, GumGum, Media.net, Sonobi, and Equativ (f/k/a Smart AdServer) have made public statements about partnering with HUMAN Security to prevent ads from being served to bots. Additionally, some of these ad tech vendors have announced partnerships with DoubleVerify and IAS.

    1. Trade Desk’s and Google’s own ads promoting their own products or services were observed being served to these bots.

  8. Some advertisers, such as the Government of Ontario, Canada and Progressive (insurance), appear to be utilizing DoubleVerify’s Scibids to target their ads. Scibids is a DoubleVerify subsidiary that allegedly uses AI to target ads. It appears that some of these DoubleVerify Scibids-targeted ads were served to declared bots - with no long-term cookies or user IDs - running out of data center server farms.

    1. Ads promoting Scibids’ own products were served by Google to bots with no pre-existing user IDs.

  9. IAS’s publisher services pixel appears to label declared bots (whose user agents have been on the IAB Tech Lab Spiders and Bots list since 2013) running out of known data center server farms as valid, human traffic 17% of the time (meaning that for every 100 declared bot page view sessions observed herein, IAS labeled 17 as valid, human traffic). For non-declared bots operating out of data centers or other IPs, IAS’s pixel appears to label the bot traffic as valid, human traffic 77% of the time in the observed sample dataset.

    1. In some cases, it appears that IAS’ publisher optimization tool labeled the same bot and page view session as both valid, human traffic (“fr=false”) and simultaneously as invalid, bot traffic (“fr=true”).

    2. Furthermore, in some cases, even when IAS’ publisher optimization tool identified a given user as a bot, there were still ads served to the bot on behalf of advertisers who appeared to be using IAS’ advertiser-side tools.

  10. Many premium publishers and ad tech industry trade press websites were observed serving ads to bots. For example, the trade press sites adweek.com and adage.com were observed serving Trade Desk ads promoting Trade Desk’s own services to bots. Many publishers that appear to use IAS’ and/or DoubleVerify’s publisher optimization tools - such as Wall Street Journal, Reuters, Fandom, Forbes, Weather.com, Condé Nast, and Washington Post - were observed serving major advertisers’ ads to bots in data centers for years.

  11. Many media agencies were observed transacting clients’ ads to declared bots on the IAB Tech Lab Spiders and Bots list running out of data center server farms. These included: GroupM / WPP, Interpublic Group (IPG), Dentsu, Publicis Groupe, Havas, Omnicom, Horizon Media, and MiQ.

  12. Google Ad Manager (GAM) - a publisher ad server - was observed transacting millions of ad impressions from thousands of different advertisers to bots. Some of these bots were declared bots operating out of Google’s own data center.

  13. YouTube was observed serving TrueView skippable in-stream video ads for many different brands for years to bots running out of data center server farms - via the Google Video Partners (GVP) network. Some of these TrueView ads were served to declared bots operating out of Google Cloud data centers. Brands whose TrueView ads were served to bots include Senator Mike Lee (R-UT), the Wall Street Journal, the Miami-Dade County government Special Victims Bureau, Sexual Predator and Offender Unit (SPOU), Ernst & Young, Lego, Kelloggs, Kenvue (f/k/a J&J Consumer), Mercedes-Benz, and Visa.

Executive Summary

Introduction

Background

What are bots, and how prevalent are they on the open internet?

What is the estimated financial impact of bot traffic or ad fraud on digital advertising?

Background - What is “invalid traffic” in ad tech industry jargon?

Background - What is the IAB Tech Lab Spiders and Bots list?

Background: OpenRTB Protocol Guidance on ad bidding and serving to bots

Background - Trustworthy Accountability Group (TAG) - Certified Against Fraud

Background - Media Rating Council (MRC) - Invalid Traffic Detection and Filtration Standards Addendum

Background - Media Rating Council (MRC) Accreditation for IVT Detection

Background - Ad tech vendors which partner with HUMAN Security

Background - What is “Pre-bid targeting” in digital advertising?

Background - Integral Ad Science (IAS) ad fraud solutions

Background - DoubleVerify ad fraud solutions

Background - Prior research on bot detection and filtration vendors’ technology by Shailin Dhar - “Mystery Shopping Inside the Ad Fraud Verification Bubble”

Background: US Senators Letters ask Federal Trade Commission to investigate “willful blindness to fraudulent activity in the online ad market”

Adalytics public interest research objectives

Research Methodology

Research Methodology - Bot web traffic source #1 - HTTP Archive

Research Methodology - Bot web traffic source #2 - Anonymous web crawler vendor

Research Methodology - Bot web traffic source #3 - URLScan.io

Research Methodology - Examination of putative bot traffic classifications from IAS on various sites

Research Methodology - Digital forensics of ad source code

Research Results

Research Results: IAS’s publisher services pixel appears to label bots as valid human traffic in certain occasions

Research Results: Publishers which appear to partner with IAS and DoubleVerify and were seen serving ads to declared bots in data centers

Research Results: Ad tech vendors who claim to partner with HUMAN Security that were observed serving ads to bots

Which demand side platforms (DSPs) were observed transacting the highest relative number of ad impressions to bots in Oct 2024?

Google DV360 ads served to bots

Trade Desk and various SSP ads served to bots

Ads served by Trade Desk Open Path to declared bots on the IAB Bots List operating out of known data center IP addresses

Private marketplace (PMP) and other curated ad deals served to bots by vendors which allegedly partner with HUMAN Security

Ads promoting the Trade Desk itself - in its capacity as an advertiser - served to bots by Trade Desk and other vendors that claim to partner with HUMAN Security

Research Results: Brands whose ads include references to “charge-allDoubleVerifyBotAvoidance” and had their ads served to bots

Research Results: Brands whose ads appear to be mediated by DoubleVerify’s Scibids AI technology and served to bots

Research Results: Brands whose ads include references to “charge-allIntegralQualitySync” or “charge-allIntegralSuspiciousActivity” and had their ads served to bots

Research Results: YouTube TrueView ads served to bots on Google Video Partner sites

Research Results: Google’s publisher ad server appears to have served millions of impressions from thousands of advertisers to bots, including declared bots operating out of Google’s own data centers

Research Results: Hundreds of thousands of United States government healthcare.gov ads were served to bots operating out of Google’s own data centers

Research Results: Publishers who were seen consistently NOT serving ads to HTTP Archive’s declared bot

Conclusion

Caveats & Limitations

Caveats & Limitations - Questions & Answers (Q&A)

Discussion

Introduction

Over the course of the last year, Adalytics has produced several research reports and worked with various public sector and Fortune 500 brand advertisers and media agencies to analyze their marketing data.

Several media buyers requested that Adalytics review their ad delivery data and/or log files generated by their advertising vendors. Some of these advertisers reported that they were paying several million dollars per year for bot avoidance pre-bid segments or technology. According to the advertisers, this technology was designed as an extra insurance policy to ensure that the advertisers’ ads would not be served to bots or “invalid traffic” (industry jargon for ad delivery that does not conform with various industry standards and buyers’ expectations, such as mis-declared or falsified ad placements).

The vendors providing this technology to advertisers, publishers, and media agencies hold certifications from various advertising trade groups and accreditation bodies.

Despite paying for this technology, the brands were surprised to observe that their log files contained evidence that the brands had been charged and billed by their ad tech vendors for ads that were served to openly declared bots. Some brands reportedly saw that their ads were served to “Headless Chrome”, various AI company crawlers and scrapers, and search engine crawlers. Some brands reportedly even had their ads served to a contextual classification bot operated by the very ad verification company the brands were paying millions of dollars to for bot avoidance.

In a few instances, it appeared that a verification vendor providing bot avoidance technology was marking significant percentages of traffic from an openly declared bot operating out of a known data center as valid, human traffic.

Furthermore, during the course of several research projects, Adalytics observed that the United States Navy, US Army, Centers for Disease Control and Prevention, Department of Veterans Affairs, and US Postal Service had their ads served to confirmed bots. Even members of Congress, such as Republican Senator Mike Lee, sponsor of the AMERICA Act, were observed as having their ads served to bots.

Prompted by these experiences and observations, the requests of numerous Fortune 500 brand marketing executives, and the public interest component of the United States government itself being affected, Adalytics decided to perform exploratory research to try to better understand the phenomenon of how ad tech vendors allow digital ads to be served to bots in data centers.

Background

This section of the report is intended to provide readers with background on what bots are, the prevalence of bot traffic on the open internet, various advertising industry jargon, advertising industry standards regarding ad delivery to bots, and prior research investigating the role of bot avoidance technology vendors.

What are bots, and how prevalent are they on the open internet?

According to Cloudflare, a hosting company, “Bot traffic describes any non-human traffic to a website or an app. The term bot traffic often carries a negative connotation, but in reality bot traffic isn’t necessarily good or bad; it all depends on the purpose of the bots. Some bots are essential for useful services such as search engines and digital assistants (e.g. Siri, Alexa). Most companies welcome these sorts of bots on their sites. Other bots can be malicious, for example those used for the purposes of credential stuffing, data scraping, and launching DDoS attacks. Even some of the more benign ‘bad’ bots, such as unauthorized web crawlers, can be a nuisance because they can disrupt site analytics and generate click fraud.”

HUMAN Security, a company that provides technology to filter or block bots, states: “When people visit your site to view your products and make purchases, that’s human traffic. When automated software — also called a bot — visits your site, that’s bot traffic.”

Source: HUMAN Security

Cloudflare further describes the negative consequences of bot traffic. Cloudflare states: “unauthorized bot traffic can impact analytics metrics such as page views, bounce rate, session duration, geolocation of users, and conversions. These deviations in metrics can create a lot of frustration for the site owner; it is very hard to measure the performance of a site that’s being flooded with bot activity. Attempts to improve the site, such as A/B testing and conversion rate optimization, are also crippled by the statistical noise created by bots [...] For sites that serve ads, bots that land on the site and click on various elements of the page can trigger fake ad clicks; this is known as click fraud.”

HUMAN Security states: “Automated traffic taxes your infrastructure and increases your costs for bandwidth and compute cycles. Bot traffic can overwhelm your network and slow site performance, which frustrates customers and negatively impacts user experience [...] In addition, bot traffic contaminates your data and skews your analytics — for example, by appearing to show an increase in consumer demand. Flawed metrics on user behavior can lead you to make poor business decisions about pricing, stocking goods, and investing in marketing and advertising.”

Cloudflare further states: “It is believed that over 40% of all Internet traffic is comprised of bot traffic, and a significant portion of that is malicious bots” (emphasis added).

The website of the Audit Bureau of Circulations (ABC UK), a British non-profit that develops standards of media brand measurement, states that “Up to 40% of web traffic is invalid [...] Designed to inflate website and ad campaign numbers, invalid traffic has an impact on advertising costs and can skew reported data making it difficult to measure engagement and traffic sources.”

Source: ABC UK and Cheq

The 2024 Imperva Threat Research report “reveals that almost 50% of internet traffic comes from non-human sources. Bad bots, in particular, now comprise nearly one-third of all traffic. Bad bots have become more advanced and evasive and now mimic human behavior in such a way that it makes them difficult to detect and prevent.”

According to June 2024 research from Akamai Technologies, another hosting company, “bots compose 42% of overall web traffic, and 65% of these bots are malicious.”

According to The Atlantic, “The Internet Is Mostly Bots - More than half of web traffic comes from automated programs—many of them malicious”, citing a 2017 study “which is based on an analysis of nearly 17 billion website visits from across 100,000 domains.”

Source: The Atlantic

Bot traffic can also increase the electricity consumption and Scope 3 carbon emissions associated with advertisers’ media supply chains. According to a December 2023 research report from Scope3, a company that monitors carbon emissions associated with digital advertising, “fraud in US programmatic display contributes an estimated 353k mt of carbon emissions.”

Source: Scope3

What is the estimated financial impact of bot traffic or ad fraud on digital advertising?

In addition to the consequences of unfiltered bot traffic on web measurement and analytics, as well as on Scope 3 carbon emissions, various entities have sought to estimate the financial impact of bot traffic or ad fraud on the digital advertising industry.

It bears mentioning that there are many forms of ad fraud beyond bot traffic. For example, one form of ad fraud may involve a vendor billing an advertiser for digital ads that were never served at all, or that were served to real humans but in a lower-quality context than what was contractually agreed upon.

Different studies have tried to quantify the entire scale of ad fraud, or focused just on the subset of ad fraud related specifically to bot traffic. Furthermore, it is worth noting that not all bot traffic is fraudulent: some bots perform benign or beneficial activities, such as search engine crawling and indexing, or monitoring website quality. Readers should be careful not to assume that all bots are malicious or engineered specifically for the purpose of ad fraud.

According to the World Federation of Advertisers (WFA), a trade group, “Ad fraud likely to exceed $50bn globally by 2025 on current trends, second only to the drugs trade as a source of income for organized crime.”

The Association of National Advertisers (ANA) published a study in 2017 that analyzed data collected from ten billion online ads purchased over a two-month period by 49 of the trade body’s advertiser members. One advertiser allegedly had “served 37 per cent of its online ad campaign to bots”.

The Association of National Advertisers (ANA) reported that “A false sense of security enables fraud to thrive.”

In 2022, the ANA wrote that “According to the results of a global 2019 study by Juniper Research, ad fraud costs the marketing industry an estimated $51 million per day, and these losses are likely to increase to $100 billion annually by 2023.” The ANA also wrote that “nearly 18 percent of all internet traffic in the marketing industry can be attributed to nonhuman bots, which are actively engaged in ad fraud.”

Background - What is “invalid traffic” in ad tech industry jargon?

The Media Rating Council, an advertising industry accreditation body, published an “Invalid Traffic Detection and Filtration Standards Addendum” in 2020, defining “invalid traffic” (IVT).

The MRC’s Standard Addendum states that: “Invalid Traffic (IVT) is defined generally as traffic or associated media activity (metrics associated to ad and content measurement including audience, impressions and derivative metrics such as viewability, clicks and engagement as well as outcomes) that does not meet certain quality or completeness criteria, or otherwise does not represent legitimate traffic that should be included in measurement counts. Among the reasons why traffic may be deemed invalid is it is a result of non-human traffic (spiders, bots, etc.), or activity designed to produce IVT.”

The MRC Addendum defines General Invalid Traffic (GIVT) and Sophisticated Invalid Traffic (SIVT).

The MRC Addendum states: “General Invalid Traffic” or GIVT, consists of traffic identified through routine means of filtration executed through application of lists or with other standardized parameter checks. Key examples are:

  • “Known invalid data-center traffic (determined to be a consistent source of invalid traffic; not including routing artifacts of legitimate users or virtual machine legitimate browsing);”

  • “Bots and spiders or other crawlers (except those as noted below in the ‘Sophisticated Invalid Traffic’ category);”

  • “Non-browser user-agent headers or other forms of unknown browsers;”

  • “Pre-fetch or browser pre-rendered traffic (where associated ads were not subsequently accessed by a valid user).”

The MRC Addendum states that “the second category, herein referred to as “Sophisticated Invalid Traffic” or SIVT, consists of more difficult to detect situations that require advanced analytics, multi-point corroboration/coordination, significant human intervention, etc., to analyze and identify. Key examples are:

  • Automated browsing from a dedicated device: Known automation systems (e.g., monitoring/testing), emulators, custom automation software and tools; [...]

  • Bots and spiders or other crawlers masquerading as legitimate users detected via sophisticated means; [...]

  • Invalid proxy traffic (originating from an intermediary proxy device that exists to manipulate traffic counts or create/pass-on invalid traffic or otherwise failing to meet protocol validation);”

Screenshot of the MRC Invalid Traffic Detection and Filtration Standards Addendum - Categories of IVT and Associated General Requirements

According to Google, “Invalid traffic includes any clicks or impressions that may artificially inflate an advertiser's costs or a publisher's earnings. Invalid traffic covers intentionally fraudulent traffic as well as accidental clicks. Invalid traffic includes, but is not limited to: [...] Automated clicking tools or traffic sources, robots, or other deceptive software.”

Screenshot of Google AdSense Help - Definition of invalid traffic

Source: Google Campaign Manager 360 Help - https://support.google.com/campaignmanager/answer/6076504?hl=en

According to HUMAN Security, an ad tech vendor, “Invalid Traffic (IVT) is generated when malicious bots view and click on ads in order to inflate conversion numbers, resulting in wasted advertiser spend.”

HUMAN Security defines General invalid traffic (GIVT) as “Usually originating from data centers, general invalid traffic is created by simple bots that are not meant to be malicious and are easier to spot with basic bot detection solutions. There are a few different categories of GIVT: Data center: When the IP address associated with non-human ad traffic traces to a server in a data center, it is considered bot traffic. Known crawler: Associated with automated scripts or programs—often called bots or spiders—that are coded to identify themselves as non-human, known crawler traffic is generally thought to be good and legitimate, although not for the purposes of counting ad impressions.”

HUMAN Security defines sophisticated invalid traffic (SIVT) as “Sophisticated and malicious bot activity that is intended to closely mirror human behavior, sophisticated invalid traffic comes from bots that are particularly good at evading detection. Ridding your traffic of SIVT requires advanced bot detection. There are a few different categories of SIVT: Automated browsing: When a program or automated script requests web content (including digital ads) without user involvement and without declaring itself as a crawler, it’s considered automated browsing. These programs and scripts are generally used for malicious purposes. - Ex. Botnets”

Screenshot of HUMAN Security’s documentation on “What are the different types of invalid traffic?”

The Trade Desk defines General Invalid Traffic as: “Invalid traffic that can be identified through routine methods of filtration.” The Trade Desk defines Sophisticated Invalid Traffic as: “A form of invalid traffic that requires advanced detection and analytics tools to identify.”

Background - What is the IAB Tech Lab Spiders and Bots list?

The User-Agent header is an HTTP header that identifies the software (the “user agent”) making a given HTTP request. For example, when a person browses the web on their phone rather than on a desktop, the phone’s browser declares via this header that the request comes from a mobile browser rather than a desktop browser.

A Chrome browser on Windows, a Firefox browser on Android, a Safari browser on iOS, and a Roku connected television device each have a distinct User-Agent header.

Similarly, when various “good” bots, such as GoogleBot or BingBot, crawl the open internet, those bots openly declare themselves via the HTTP User-Agent header. If a website does not want to appear in Google or Bing search results, it can block GoogleBot or BingBot, for example via a robots.txt file or through its web hosting provider.

The Interactive Advertising Bureau (IAB) Tech Lab, an advertising industry trade group, and the Audit Bureau of Circulations (ABC) UK, a British trade group, maintain a reference list of known bots and spiders that ad tech vendors can use to detect and/or filter basic bot traffic.

According to the IAB Tech Lab, “The IAB Tech Lab publishes a comprehensive list of such Spiders and Robots that helps companies identify automated traffic such as search engine crawlers, monitoring tools, and other non-human traffic that they don’t want included in their analytics and billable counts [...] The IAB Tech Lab Spiders and Robots provides the industry two main purposes. First, the spiders and robots list consists of two text files: one for valid browsers or user agents and one for known robots. These lists are intended to be used together to comply with the “dual pass” approach to filtering as defined in the IAB’s Ad Impression Measurement Guidelines (i.e., identify valid transactions using the valid browser list and then filter/remove invalid transactions using the known robots list). Second, the spiders and robots list supports the MRC’s General Invalid Traffic Detection and Filtration Standard by providing a common industry resource and list for facilitating IVT detection and filtration.”
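To make the “dual pass” approach concrete, here is a minimal Python sketch, assuming two small in-memory token lists; the real IAB Tech Lab files are far larger and use a richer matching format (match-position flags, exception patterns, and activity classifications).

```python
# Minimal sketch of the IAB "dual pass" filtering approach, using two
# hypothetical in-memory token lists rather than the actual IAB Tech Lab files.

VALID_BROWSER_TOKENS = ["mozilla", "chrome", "safari", "firefox", "edge"]
KNOWN_ROBOT_TOKENS = ["googlebot", "bingbot", "ahrefsbot", "headlesschrome"]

def is_countable(user_agent: str) -> bool:
    """Return True if an impression with this User-Agent should be counted."""
    ua = user_agent.lower()
    # Pass 1: identify valid transactions using the valid-browser list.
    if not any(token in ua for token in VALID_BROWSER_TOKENS):
        return False
    # Pass 2: filter/remove invalid transactions using the known-robots list.
    if any(token in ua for token in KNOWN_ROBOT_TOKENS):
        return False
    return True

# A declared crawler fails pass 2 even though its UA contains "Mozilla":
print(is_countable(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))  # False
print(is_countable(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
))  # True
```

The key property is that the known-robots pass runs second, so a declared crawler is filtered even though its User-Agent string also contains a valid browser token such as “Mozilla”.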

Section 7.1 of the IAB Tech Lab OpenRTB 2.x protocol specifies that ad exchanges and bidders should filter impressions from known bots that openly declare their user agent, as described in the next section.

Background: OpenRTB Protocol Guidance on ad bidding and serving to bots

The IAB Tech Lab is an advertising industry trade group that publishes the Open Real-Time Bidding (OpenRTB) protocol, which governs the programmatic ad auctions connecting media publishers who sell ad spots with advertisers who wish to purchase and display ads.

The IAB Tech Lab OpenRTB protocol includes guidance on how ad exchanges and ad buying platforms should handle bot traffic that visits websites and generates potential ad auction requests.

“An important issue in RTB is when impressions are triggered by software robots mimicking web browsers. Such robots may be implicitly or explicitly driving these false transactions. The following represents a set of symmetric best practices for exchanges and bidders to help recognize and reject these events.

Responsibility of the exchange - Make best effort to classify and reject non-human traffic (NHT) requests for ads to the exchange via the following best practices:

  • (Recommended) Filter impressions from known spiders via user-agent classification.

  • (Recommended) Filter impressions from suspected NHT via a “detector”.

Responsibility of the bidder:

  • (Recommended) no-bid impressions from known spiders via user-agent classification.

  • (Recommended) no-bid impressions from suspected NHT via a “detector”.

  • Specify a no-bid reason code in either case.

Where:

  • For exchanges, filtering the impression means that the exchange should respond to the “ad call” with either a blank HTTP 204 response or an unpaid ad (PSA) and not offered to any bidders.

  • For bidders, filtering the impression means that the bidder should respond with a no-bid.

  • For both exchanges and bidders, the impression transaction records should be clearly marked in any logging systems and be removed from contributing to any event counts associated with planning, forecasting, and reporting systems.”

Source: https://github.com/InteractiveAdvertisingBureau/openrtb2.x/blob/main/implementation.md#712---id-match-method-guidance-
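For illustration, a bidder following this guidance might implement something like the hedged Python sketch below. The no-bid reason codes (3 for “Known Web Spider”, 4 for “Suspected Non-Human Traffic”) come from the OpenRTB no-bid reason code enumeration; the classifier and detector functions are hypothetical stand-ins, not part of any real platform.

```python
# Sketch of a bidder applying the OpenRTB best practices quoted above.
# is_known_spider() and nht_detector() are hypothetical stand-ins for a
# list-based UA classifier and a behavioral/IP "detector", respectively.

KNOWN_SPIDER_NBR = 3    # OpenRTB no-bid reason: Known Web Spider
SUSPECTED_NHT_NBR = 4   # OpenRTB no-bid reason: Suspected Non-Human Traffic

def is_known_spider(ua: str) -> bool:
    # Stand-in for user-agent classification against a spiders/bots list.
    return any(tok in ua.lower() for tok in ("googlebot", "bingbot", "headlesschrome"))

def nht_detector(ua: str, ip: str) -> bool:
    # Stand-in for a suspected-NHT detector (e.g., a data center IP check).
    return ip.startswith("203.0.113.")  # TEST-NET-3 range, purely illustrative

def handle_bid_request(bid_request: dict) -> dict:
    device = bid_request.get("device", {})
    ua, ip = device.get("ua", ""), device.get("ip", "")
    if is_known_spider(ua):
        # No-bid with a reason code, per the bidder responsibilities above.
        return {"id": bid_request["id"], "nbr": KNOWN_SPIDER_NBR}
    if nht_detector(ua, ip):
        return {"id": bid_request["id"], "nbr": SUSPECTED_NHT_NBR}
    # Otherwise proceed to normal bidding logic (omitted in this sketch).
    return {"id": bid_request["id"], "seatbid": []}
```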

Background - Trustworthy Accountability Group (TAG) - Certified Against Fraud

The Trustworthy Accountability Group (TAG) is a digital advertising initiative and trade group.

TAG launched its “Certified Against Fraud (CAF) Program in 2016 to combat invalid traffic in the digital advertising supply chain.”

Screenshot of a quote from P&G Chief Brand Officer Marc Pritchard on the TAG website

To achieve the Certified Against Fraud Seal, any participating company must ensure that 100% of the monetizable transactions (including impressions, clicks, conversions, etc.) that it handles are filtered for both general invalid traffic (GIVT) and sophisticated invalid traffic (SIVT) in a manner compliant with a TAG-recognized standard for IVT detection and removal as referenced in Appendix A.

Source: TAG

All inventory handled by a participating company – including inventory on that company’s owned and operated media properties as well as any inventory handled by that company on behalf of a third-party partner – must be filtered for GIVT and SIVT in a manner compliant with a TAG-recognized standard for IVT detection and removal as referenced in Appendix A.

Background - Media Rating Council (MRC) - Invalid Traffic Detection and Filtration Standards Addendum

The Media Rating Council (MRC) is a US-based non-profit organization that manages accreditations for media research and rating purposes. It performs accreditations for rating and research companies like Nielsen, comScore, and multiple digital measurement services. The MRC does not itself conduct the audits of the companies being accredited; the audits are done annually by accounting firms such as Ernst & Young. The company being accredited pays for the audit, with fees that can run into the hundreds of thousands or even millions of dollars.

The Media Rating Council has published Standards documents for “Invalid Traffic Detection and Filtration”.

The document “presents additional Standards for the detection and filtration of invalid traffic applicable to all measurement organizations of advertising, content and related media metrics (including outcome measurement) subject to accreditation or certification audit. MRC’s original guidance is contained in measurement guidelines maintained by the IAB.”

The document explains, “Among the reasons why traffic may be deemed invalid is it is a result of non-human traffic (spiders, bots, etc.).”

The MRC IVT Standards addendum discusses the role of IP address, user-agent, and industry reference lists in detecting invalid traffic (among many other topics and details).

Source: Media Rating Council

The “MRC is requiring filtration of invalid data-center traffic originating from IPs associated to the three largest known hosting entities: Amazon AWS, Google and Microsoft. This means filtration of IPs within those of known hosting entities determined to be a consistent source of invalid traffic not including routing artifacts of legitimate users or virtual machine legitimate browsing.”

Source: Media Rating Council

The Media Rating Council (MRC) states: “The following techniques shall be employed by the measurement organization to the extent necessary to filter material General Invalid Transactions.” The listed techniques include “parameter based detection”, which covers “non-Browser User-Agent Header” and “Known Invalid Data-Center Traffic”.
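The sketch below illustrates what these parameter-based GIVT checks can look like in code, assuming a hypothetical hard-coded sample of data center CIDR blocks; production systems would instead consume maintained reference lists of hosting-provider IP ranges and valid-browser user agents.

```python
import ipaddress

# Hypothetical sample of data center CIDR blocks, for illustration only;
# real filtration uses maintained reference lists of hosting-provider ranges.
DATA_CENTER_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # documentation range stand-in
    ipaddress.ip_network("198.51.100.0/24"),  # documentation range stand-in
]

def is_data_center_ip(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATA_CENTER_NETWORKS)

def is_general_invalid(user_agent: str, ip: str) -> bool:
    # "Non-Browser User-Agent Header": crude check that the UA resembles a
    # browser at all (real implementations match against a valid-browser list).
    non_browser_ua = not user_agent.lower().startswith("mozilla/")
    # "Known Invalid Data-Center Traffic": IP falls within a known hosting range.
    return non_browser_ua or is_data_center_ip(ip)

print(is_general_invalid("python-requests/2.31.0", "203.0.113.7"))            # True
print(is_general_invalid("Mozilla/5.0 (Windows NT 10.0; Win64; x64)", "198.18.0.1"))  # False
```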

Background - Media Rating Council (MRC) Accreditation for IVT Detection

The Media Rating Council has provided accreditations to various vendors for invalid traffic detection and/or filtration.

One can visit the website of the Media Rating Council to see what accreditations various ad tech vendors have received from the MRC.

In 2021, HUMAN Security (f/k/a “White Ops”) announced that it was the first company to “Receive MRC Accreditation for Both Pre-Bid and Post-Bid Invalid Traffic Detection and Mitigation Across Desktop, Mobile Web, Mobile In-App and Connected TV”. The announcement states this “includes the first-ever accreditation for pre-bid detection and mitigation of SIVT of its Advertising Integrity product (the solution formerly known as MediaGuard) across all platforms.”

HUMAN’s press release stated: “White Ops today verifies more than ten trillion digital interactions per week, working directly with the largest internet platforms, DSPs and exchanges. With White Ops Advertising Integrity, platforms can tap into the most comprehensive pre-bid prevention and post-bid detection capabilities to verify the validity of advertising efforts across all channels. The White Ops bot mitigation platform uses a multilayered detection methodology to spot and stop sophisticated bots and fraud by using technical evidence, continuous adaptation, machine learning and threat intelligence. In most cases, White Ops delivers responses to partners in 10 milliseconds or less before a bid is made, saving time and money and ensuring that their advertising inventory can be trusted and fraud-free.”

The MRC’s website indicates that HUMAN Security (f/k/a “White Ops”) has received MRC accreditations for “Pre-bid IVT detection” and “Sophisticated Invalid Traffic Detection/Filtration”.

Screenshot showing that HUMAN Security has received an accreditation from the Media Rating Council for “Sophisticated Invalid Traffic Detection/Filtration”. Source: Media Rating Council https://mediaratingcouncil.org/accreditation/digital

Other ad tech vendors that have received accreditations from the Media Rating Council include Integral Ad Science (IAS) and DoubleVerify.

In the screenshot below from the MRC’s website, one can see that IAS has received MRC accreditation for “Sophisticated Invalid Traffic Detection/Filtration”.

Screenshot showing that Integral Ad Science (IAS) has received an accreditation from the Media Rating Council for “Sophisticated Invalid Traffic Detection/Filtration”. Source: Media Rating Council https://mediaratingcouncil.org/accreditation/digital

The screenshot below shows that DoubleVerify has received MRC accreditation for “Pre-bid IVT detection” and “Sophisticated Invalid Traffic Detection/Filtration”.

The Media Rating Council website explains that for DoubleVerify’s “Pre-bid IVT detection”, “Pre-bid available to DV enterprise clients as pre-bid filtration, and to DV agency/advertiser clients as individual pre-bid segments and with the Authentic Brand Suitability segment”.

Screenshot showing that the DoubleVerify Quality Analytics Platform has received an accreditation from the Media Rating Council for “Sophisticated Invalid Traffic Detection/Filtration” and “Pre-bid IVT detection”. Source: Media Rating Council https://mediaratingcouncil.org/accreditation/digital

Background - Ad tech vendors which partner with HUMAN Security

HUMAN Security’s “Ad Fraud Defense” marketing page declares: “Safeguard your inventory in pre-bid environments. HUMAN Ad Fraud Defense analyzes bid requests—before bid initiation—to stop invalid traffic from entering your inventory. By transacting in a fraud-free environment, your reputation improves, protecting your revenue and driving demand from partners.”

Source: HUMAN Security

HUMAN Security’s marketing page for its “MediaGuard” product states that “For each of the 2 trillion-plus interactions verified by HUMAN each day, up to 2,500 signals are parsed through over 350 algorithms to reach a single critical decision - bot or not.”

Source: HUMAN Security

HUMAN Security’s website explains how MediaGuard works: “Powerful detection techniques are applied on a per-session analysis of each impression using HUMANs' internet-scale observability to decipher between real humans and sophisticated bots.”

Source: HUMAN Security

HUMAN Security’s public technical documentation states: “MediaGuard is HUMAN's pre-bid solution for detecting advertising fraud. By analyzing the bid requests you receive, MediaGuard makes real-time predictions about the validity of the users driving each bid request, then provides you with a recommendation to bid or not bid on the associated ad inventory. This pre-bid verification gives you the necessary information to eliminate different forms of invalid traffic from your bidding workflows before you serve impressions to those users.”

HUMAN further explains that “MediaGuard is an intermediary service that sits between the bid requests you receive and the bids that you actually make. After you've set up a MediaGuard integration, you'll be able to send data to MediaGuard and receive real-time predictions about your bid requests. Every time you receive a bid request, you'll either need to extract certain details from that bid request and send them to MediaGuard or send the bid request's entire OpenRTB BidRequest object to MediaGuard. MediaGuard analyzes these details to determine whether the user driving that bid request is valid or invalid. Any invalid bid requests are sorted into different IVT categories with information about the specific traits displayed by that bid request. MediaGuard then returns a real-time prediction that you can incorporate into your bidding workflows. For example, you might configure your bids to automatically drop any bid requests that receive an IVT=true response from MediaGuard. You can also view your MediaGuard data in the HUMAN Dashboard or via our Reporting API.”

Source: HUMAN Security Media Overview technical documentation
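Based on HUMAN’s description, a buy-side integration could resemble the sketch below. The endpoint URL, payload shape, and response fields here are illustrative placeholders (the documentation above only indicates that bid request details are sent and an IVT prediction such as “IVT=true” is returned); this is not HUMAN’s actual API.

```python
import requests

# Illustrative placeholder endpoint -- not HUMAN's actual API.
PREBID_IVT_URL = "https://example.invalid/prebid/predict"

def should_bid(openrtb_bid_request: dict) -> bool:
    """Drop any bid request that the pre-bid service flags as IVT.

    Uses a 10 ms timeout, mirroring the "10 milliseconds or less" latency
    HUMAN describes; this sketch fails open on errors, a choice a real
    integration would need to make deliberately.
    """
    try:
        resp = requests.post(PREBID_IVT_URL, json=openrtb_bid_request, timeout=0.01)
        prediction = resp.json()  # assumed shape, e.g. {"IVT": true, "category": "..."}
    except requests.RequestException:
        return True  # fail open (illustrative choice)
    # Per the documentation above: "automatically drop any bid requests
    # that receive an IVT=true response".
    return not prediction.get("IVT", False)
```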

Many ad tech vendors have made public announcements about partnering with HUMAN Security (f/k/a “White Ops”), including using its MediaGuard product to check their ad inventory pre-bid.

For example, Index Exchange SSP stated in a press release:

“Index Exchange Partners with White Ops to Deliver Invalid Traffic Protection Against Sophisticated Bots Across All Global Inventory Channels”; “Index Exchange (IX), one of the world’s largest independent ad exchanges, and White Ops, the global leader in collective protection against sophisticated bot attacks and fraud, today announced an expanded partnership that enhances Index Exchange’s global inventory across all channels and regions. Through White Ops’ comprehensive protection, the partnership protects the entirety of Index Exchange’s global inventory. It allows buyers to purchase from IX’s emerging channels, such as mobile app and Connected TV (CTV), with confidence that its supply chain is protected against invalid traffic before a bid request is ever sent to a DSP and made eligible” (emphasis added).

Yieldmo SSP stated:

White Ops, the global leader in bot mitigation, has announced a partnership with Yieldmo, one of the world’s largest independent mobile ad marketplaces, to protect its programmatic demand partners from sophisticated bot attacks. [...] MediaGuard is a bot prevention API using machine-learning algorithms that learn and adapt in real-time to accurately block bot-driven ad requests before a buyer has the opportunity to bid on them. Eliminating fraudulent ad requests earlier in the bid process results in better overall performance for advertisers and yield for publishers. By leveraging White Ops’ Bot Mitigation Platform, Yieldmo can accurately block sophisticated bots and pre-bid IVT without compromising the speed of their website and online operations. MediaGuard prevents bots and IVT across desktop, mobile web, mobile app, and connected TV (CTV) environments in real time before an impression is served.

Triplelift SSP stated in 2018:

Earlier this month our team announced that in partnership with White Ops, TripleLift would become the first native exchange to offer third-party verified pre-bid fraud prevention, across all of our inventory. This partnership allows TripleLift to leverage White Ops’s MediaGuard, to assist in the pre-bid prevention of fraud across the entire exchange.

Pubmatic SSP stated:

White Ops today announced its partnership with sell-side platform (SSP) PubMatic, to defend against fraudulent, non-human traffic impressions across PubMatic’s video and mobile inventory. Through this partnership, PubMatic has globally implemented White Ops' pre-bid and post-bid solutions to provide ad fraud detection and prevention.

Demand side platform Trade Desk announced:

“White Ops and The Trade Desk [...] today announce a landmark deal that completely changes how the advertising industry tackles fraud. White Ops’ Human Verification technology will aim to ensure that there is a human on the other end of every impression served on The Trade Desk that runs through White Ops, in real time, protecting global advertisers and agencies from buying fraudulent impressions [...] For too long, invalid traffic has been part of our industry,” said Jeff Green, CEO and co-founder of The Trade Desk. “There’s no level of fraud that is acceptable. Our partnership with White Ops means that we are the first advertising platform to block non-human impressions at the front door. [...] ​​As part of this initiative, White Ops and The Trade Desk will co-locate servers and data centers in North America, Europe and Asia, to scan every biddable ad impression in real-time. [...] When a non-human impression, known as “Sophisticated Invalid Traffic (SIVT)” within the advertising industry, is identified by White Ops, The Trade Desk will block that impression from serving. The intent is this technology will be applied to every impression The Trade Desk bids on that runs through White Ops, on a global basis. [...] Unlike other solutions, the goal here is to run all impressions across The Trade Desk’s platform through White Ops, not just sampled impressions. Additionally, the Trade Desk has collaborated with the leading SSPs to bring a unified solution to market.” (emphasis added).

Screenshot of HUMAN Security’s website, showing a press release from 2017

Screenshot of Trade Desk’s website, showing the firm partners with HUMAN Security

https://www.adweek.com/performance-marketing/the-trade-desk-and-white-ops-are-teaming-up-to-block-bot-traffic-before-advertisers-buy-it/

Xandr (f/k/a AppNexus), now owned by Microsoft, announced:

“Xandr [...] and HUMAN, a cybersecurity company best known for collectively protecting enterprises from bot attacks, today announced an expansion to HUMAN’s existing pre-bid bot protection within the Xandr platform, to provide an additional layer of protection against fraud and sophisticated bot attacks as emerging formats like connected TV (CTV) scale in availability and demand. This integration connects the full breadth of HUMAN’s Advertising Integrity suite of ad fraud detection and prevention solutions to Xandr’s buy-and sell-side platforms [...] Xandr protects its platform before a bid is even made—including within CTV—to continue delivering success to its publishers and advertisers. HUMAN recently became the first company to receive accreditation from the Media Rating Council (MRC) for pre and post-bid protection against Sophisticated Invalid Traffic (SIVT) for desktop, mobile web, mobile in-app, and CTV.”

Many ad tech vendors are part of HUMAN Security’s “Human Collective.”

List of Flagship and Founding Members of HUMAN Security’s “Human Collective”. Source: https://www.humansecurity.com/company/the-human-collective. Participation in the Human Collective entails: “Technology - Powered by HUMAN's proprietary technology and Modern Defense Platform, we can ensure members are protecting themselves and each other.”

A list of vendors who issued public statements or claimed to have partnered with HUMAN Security (f/k/a “White Ops”) can be seen below.

The Trade Desk (8/31/2017): https://www.humansecurity.com/newsroom/the-trade-desk-partners-with-white-ops-to-become-first-advertising-platform-to-block-fraudulent-impressions-before-they-are-purchased
Microsoft Xandr (6/15/2018): https://www.prnewswire.com/news-releases/appnexus-expands-inventory-quality-initiative-through-partnership-with-white-ops-300666958.html
Index Exchange (8/12/2020): https://www.humansecurity.com/newsroom/index-exchange-partners-with-white-ops
Google (N/A): https://support.google.com/campaignmanager/answer/6076504?hl=en
Sovrn (10/13/2021): https://www.humansecurity.com/newsroom/the-human-collective-grows-more-than-5x-since-april-launch
Yieldmo (9/10/2019): https://www.humansecurity.com/learn/blog/yieldmo-and-white-ops-ensuring-squeaky-clean-mobile-inventory
Sharethrough (10/27/2021): https://www.sharethrough.com/blog/sharethrough-founding-member-of-the-human-collective-in-effort-to-keep-ads-human
Kargo (5/29/2019): https://www.humansecurity.com/newsroom/kargos-curated-marketplace-boasts-less-than-0.3-of-invalid-traffic-following-implementation-of-white-ops-mediaguard
GumGum (5/29/2019): https://www.businesswire.com/news/home/20190529005284/en/GumGum-Partners-with-White-Ops-to-Deliver-Comprehensively-Safe-Ad-Exchange
Triplelift (5/1/2018): https://www.humansecurity.com/newsroom/triplelift-and-white-ops-partner-to-fight-fraud-in-native-advertising
Magnite (12/20/2016): https://www.businesswire.com/news/home/20161220005758/en/Joint-Statement-From-White-Ops-CEO-and-Rubicon-Project-President-on-Successful-Efforts-Countering-Russian-Ad-Fraud
Pubmatic (12/5/2017): https://www.humansecurity.com/newsroom/pubmatic-partners-with-white-ops-to-fight-bot-fraud-and-drive-higher-transparency-in-video-inventory
Sonobi (5/11/2021): https://www.prnewswire.com/news-releases/sonobi-partners-with-human-formerly-white-ops-to-safeguard-platform-from-sophisticated-bot-fraud-301286928.html
Freewheel (4/29/2020): https://www.humansecurity.com/newsroom/freewheel-and-white-ops-expand-partnership-globally-to-further-deepen-trust-in-premium-video-inventory
Media.net (5/15/2019): https://www.humansecurity.com/newsroom/white-ops-media.net-partnership-extends-pre-bid-fraud-protection-for-brands
Beachfront (5/15/2018): https://www.humansecurity.com/newsroom/white-ops-beachfront
Primis (9/1/2020): https://www.humansecurity.com/newsroom/primis-expands-partnership-with-white-ops-in-fight-against-fraud-to-create-a-clean-and-trusted-video-supply-chain
Omnicom (4/14/2021): https://www.humansecurity.com/newsroom/human-formerly-white-ops-launches-the-human-collective-to-protect-against-bot-attacks-and-fraud-across-advertising-supply-chain

Background - What is “Pre-bid targeting” in digital advertising?

In programmatic digital advertising, “Real Time Bidding” (RTB or OpenRTB) is a protocol that governs digital ad auctions, where publishers and advertisers can offer to sell and buy ad placements (respectively).

Before an advertiser “bids” on a given programmatic ad auction, “pre-bid” data segments and solutions can inform the ad bidding process.

For example, according to the IAB and Integral Ad Science (IAS), an ad verification vendor, the way “Pre-bid targeting” informs real time bidding is as follows:

  1. “Before bidding, check the relevant media quality filter within your DSP to disqualify pages that do not meet requirements”

  2. “Bids are placed on qualified inventory”

  3. “Pre-bid targeting fees are only applied to winning bids”

Source: IAB and IAS
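As a toy model of the three-step flow above, the Python sketch below disqualifies inventory pre-bid and applies the targeting fee only to winning bids; all numbers, field names, and the filter function are hypothetical, not any vendor’s actual implementation.

```python
# Simplified model of pre-bid targeting economics: the segment disqualifies
# some auctions before bidding, and the targeting fee applies only to wins.
# All numbers and field names are hypothetical.

def run_campaign(auctions, passes_prebid_filter, bid_price, prebid_fee_cpm):
    wins, fees = 0, 0.0
    for auction in auctions:
        # Step 1: disqualify inventory that fails the pre-bid media quality filter.
        if not passes_prebid_filter(auction):
            continue
        # Step 2: bid on qualified inventory (a bid wins if it meets the clearing price).
        if auction["clearing_price"] <= bid_price:
            wins += 1
            # Step 3: pre-bid targeting fees are applied only to winning bids.
            fees += prebid_fee_cpm / 1000.0
    return wins, fees

auctions = [
    {"clearing_price": 2.0, "ivt_score": 0.1},
    {"clearing_price": 5.0, "ivt_score": 0.2},
    {"clearing_price": 1.5, "ivt_score": 0.9},  # disqualified pre-bid
]
wins, fees = run_campaign(
    auctions,
    passes_prebid_filter=lambda a: a["ivt_score"] < 0.5,
    bid_price=3.0,
    prebid_fee_cpm=0.05,
)
print(wins, round(fees, 5))  # 1 win, $0.00005 in fees
```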

Background - Integral Ad Science (IAS) ad fraud solutions

Integral Ad Science (IAS) is an ad verification company that has received multiple Media Rating Council accreditations, and is TAG “Certified Against Fraud”.

IAS provides ad verification solutions to both advertisers and to media publishers.

IAS offers “pre-bid” targeting data solutions in various ad buying platforms (known as “demand side platforms” or DSPs), such as Trade Desk, Beeswax, and Google DV360.

Source: IAS

IAS’s website states that “utilizing IAS pre-bid segments” can help advertisers “Only bid on quality Inventory” in programmatic ad auctions.

Source: IAS

IAS’s website states: “IAS is a leader in the fight against ad fraud through a balanced approach and ongoing research. Our team of specialized analysts, engineers, white-hat-hackers, and data scientists is taking on invalid traffic from every angle to create the most advanced solutions in the market [...] Ensure the most precise fraud detection possible with our three-pillar approach. Our fraud technology is based on a set of methodologies that when used in tandem, detect the evolving threat of ad fraud with incredible accuracy.”

Source: IAS

According to IAS’ public documentation, IAS offers a pre-bid segment to prevent ad fraud or “suspicious activity”. These segments can help “remove fraudulent bot traffic from your campaign.”

Screenshot of IAS technical documentation, describing IAS “Suspicious Activity” Pre-Bid Segments; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

According to IAS’ public documentation, IAS’ Ad Fraud Solution utilizes multiple detection methodologies for maximum protection. This three-pillar approach is marketed as being “powered by unmatched scale and machine learning, providing the most accurate detection & prevention.” The IAS Ad Fraud Solution claims to use “rules-based detection [...] to identify any anomalous behavior patterns”, “AI/Machine Learning [...] using big data to detect any hidden, uncommon patterns”, and “malware analysis & reverse engineering to uncover any emerging threats.”

Screenshot of IAS technical documentation, describing IAS Ad Fraud Solution; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

According to IAS’ public documentation, IAS offers a “Quality Sync” Pre-bid segment which “automatically matches your pre- and post-bid settings” including “Ad Fraud (IVT)” settings.

Screenshot of IAS technical documentation, describing IAS Quality Sync Pre-Bid Segments; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

Background - DoubleVerify ad fraud solutions

DoubleVerify (DV) is an ad verification company that has received multiple Media Rating Council accreditations, and is TAG “Certified Against Fraud”. DoubleVerify’s website states: “DV analyzes over 2 billion impressions daily, identifying comprehensive fraud and SIVT — from hijacked devices to bot fraud and injected ads. We're accredited by the Media Rating Council (MRC) for monitoring and blocking across devices — including mobile app, and our AI-backed deterministic methodology results in greater accuracy, fewer false positives and, ultimately, superior protection for brands.”

DoubleVerify provides ad verification solutions to both advertisers and to media publishers.

DoubleVerify offers “pre-bid targeting” solutions, including “pre-bid avoidance segments”, which block ads before an “impression is purchased”.

DoubleVerify’s public website states: “DV offers the most comprehensive and accurate pre-bid avoidance targeting available in the market. Pre-bid avoidance targeting helps brands drive efficiency from their programmatic media spend by preventing their bidding on auctions misaligned with their ad delivery standards. The solution also helps publishers by preventing the sale of impressions that ultimately result in a block and potential forfeit opportunity [...] Our millions of fraud signatures are updated nearly 100 times per day (every 15 minutes) in our DSP integrations to ensure near-immediate programmatic protection from invalid traffic.”

DoubleVerify further states: “DV analyzes over 2 billion display and video impressions daily and provides the fastest, most complete fraud identification and protection available — across web, mobile app and CTV environments. Our AI-backed deterministic methodology results in greater accuracy, fewer false positives and, ultimately, superior protection. We pre-qualify your supply to ensure you offer only quality inventory to your advertiser partners — maximizing buyer value and campaign effectiveness. DV works with platform partners in a variety of ways. We can integrate our data directly into your platform to evaluate inventory quality at its source, provide the data needed to package high-quality inventory into easily accessible segments for advertisers, or integrate our metrics into your platform to make optimization easier for buyers. In all instances, you benefit from DV's trusted data, helping to enhance the value and marketability of your inventory.”

Background - Prior research on bot detection and filtration vendors’ technology by Shailin Dhar - “Mystery Shopping Inside the Ad Fraud Verification Bubble”

In 2017, the Financial Times cited Shailin Dhar, an ad tech industry expert who conducted research on detecting bots and ad fraud.

“Mr. Dhar remains skeptical about how far the industry will get in its fraud-fighting efforts, given that many constituents have a financial incentive to maintain the status quo. “Ad tech companies have made billions of dollars a year from fraudulent traffic,” he says. “Fraud is built into the foundation of advertising supply.””

Shailin Dhar previously commented in a LinkedIn post:

“Advertisers, Why do we spend our efforts chasing “super sophisticated botnets” operated by the worlds “most devious cybercriminals”, when we haven’t stopped basic data-center/server-farms from eating up ad budgets?”

Source: https://www.linkedin.com/pulse/marketers-stop-distracting-yourself-focus-ad-fraud/

According to an AdMonsters article,

“In 2016, fraud researcher and consultant Shailin Dhar performed an experiment to test how well major ad verification services were able to root out fraudulent traffic. The methodology was as follows:

  1. Buy a website that hasn’t been flagged on the open exchanges, put up some stolen content, ad tags, and the pixels from several ad verification vendors.

  2. Buy a ton of fraudulent botnet CPC traffic from a shady vendor and send that traffic to the newly purchased site.

  3. See what vendor was able to catch the crap.

To put it in clickbait terms, the results were shocking. One of the best-known vendors in the space was only able to catch 14% of bot traffic, meaning that most of what they analyzed, in their opinion, was safe for advertising. Other vendors fared slightly better, but on a whole, the experiment showed verification companies aren’t to be completely trusted.”

Screenshot of an AdMonsters.com article from 2017.

According to Shailin Dhar’s research from 2017, Integral Ad Science was able to correctly identify only 17% of the bot traffic Mr. Dhar paid to run against an artificial website he created. Mr. Dhar says: “83% of the robotic traffic we purchased was considered human by the Integral Ad Science filter. When I informed the source traffic vendor that the rate was 17% they said that the Integral Ad Science sampling got lucky because it’s usually around 5%”.

Source: “Mystery Shopping Inside the Ad Fraud Verification Bubble”

Background: US Senators Letters ask Federal Trade Commission to investigate “willful blindness to fraudulent activity in the online ad market”

In 2016, two US Senators - Mark Warner and Chuck Schumer - wrote a letter to the Federal Trade Commission (FTC), asking the agency to take a “closer look at the negative economic impact of digital advertising fraud on consumers and advertisers”.

According to AdExchanger, the “senators contend that ad fraud unchecked will trigger a rise in wasted marketing costs that will eventually get passed down to the consumer in the form of higher prices for goods and services.”

The Senators wrote: “The cost of pervasive fraud in the digital advertising space will ultimately be paid by the American consumer in the form of higher prices for goods and services [...] It remains to be seen whether voluntary, market-based oversight is sufficient to protect consumers and advertisers.”

The Senators also asked: "To the extent that criminal organizations are involved in perpetuating digital advertising fraud, how is the FTC coordinating with both law enforcement (e.g., the Department of Homeland Security or the Federal Bureau of Investigation) and the private sector formulate an appropriate response?"

The Senators wrote another letter in 2018, saying: “I am writing to express my continued concern with the prevalence of digital advertising fraud, and in particular the inaction of major industry stakeholders in curbing these abuses. In 2016, Senator Schumer and I wrote Chairwoman Ramirez to express frustration with the growing phenomenon of digital ad fraud. Digital ad fraud has only grown since that time”.

The Senators asked the Federal Trade Commission to investigate “the extent to which major ecosystem stakeholders engage in willful blindness to fraudulent activity in the online ad market.”

Adalytics public interest research objectives

The following outlines Adalytics’ research objectives which are believed to be in the public interest.

  1. Did any advertisers have their ads served to bots operating out of data center server farms?

    1. Did any US government or US military advertisers have their ads served to bots by ad tech vendors?

    2. Did any non-profit, NGO, or charity advertisers have their ads served to bots by ad tech vendors?

  2. Were any ads served to bots by vendors who made public claims about filtering out bot traffic before an ad impression was served? If so, which vendors who made such public proclamations were observed serving ads to bots?

Research Methodology

This empirical, observational research study did not actively manipulate independent variables or seek to generate new data. Rather, the study was premised on observing and analyzing data that was already generated by other entities.

Detecting bot traffic is an ever-changing discipline. Bot detection vendors continuously upgrade their detection algorithms, whilst entities who operate bots for benign and malicious purposes seek to circumvent those detections by applying increasingly sophisticated or novel evasions.

This study circumvents the difficulties of accurately detecting bot traffic by instead relying entirely on data wherein the ground truth is universally known. Specifically, this observational research study chose to source 100% of observations and data from three bot operators who are not actively seeking to commit ad fraud.

As such, there can be no ambiguity or debate about whether the ads served in these contexts were served to humans or bots; there is absolute certainty that every single one of these ad impressions is confirmed to have been served to a bot, given the provenance of the data.

To the best of our knowledge, this study constitutes the largest analysis of declared bot traffic in the context of digital advertising.

Research Methodology - Bot web traffic source #1 - HTTP Archive

The first source of bot traffic data was HTTP Archive. The HTTP Archive is part of the Internet Archive, a 501(c)(3) non-profit.

The HTTP Archive “Tracks How the Web is Built.” HTTP Archive states that they “periodically crawl the top sites on the web and record detailed information about fetched resources, used web platform APIs and features, and execution traces of each page. We then crunch and analyze this data to identify trends — learn more about our methodology.”

Screenshot of the HTTP Archive home page - https://httparchive.org/

The HTTP Archive publishes academic research reports about the state of web technology trends on the open internet.

Screenshot of the HTTP Archive “All Reports” page - https://httparchive.org/reports

The HTTP Archive crawls millions of URLs on both desktop and mobile monthly. “The URLs come from the Chrome User Experience Report, a dataset of real user performance data of the most popular websites. The list of URLs is fed to our private instance of WebPageTest on the 1st of each month. As of March 1 2016, the tests are performed on Chrome for desktop and emulated Android (on Chrome) for mobile. The test agents are run from Google Cloud regions across the US. Each URL is loaded once with an empty cache ("first view") for normal metrics collection and again, in a clean browser profile, using Lighthouse. The data is collected via a HAR file. The HTTP Archive collects these HAR files, parses them, and populates various tables in BigQuery.”

Screenshot of the HTTP Archive’s FAQ page - https://httparchive.org/faq
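Because each crawl is recorded as a HAR (HTTP Archive format) file, one can inspect every network request a bot-recorded page load generated. Below is a minimal illustrative sketch in Python; the file name “crawl.har” and the list of ad-related domains are hypothetical examples for illustration, not part of the HTTP Archive’s own tooling.

```python
import json

# Minimal sketch: list ad-related third-party requests recorded in a single
# HAR file. "crawl.har" is a hypothetical local file; the HTTP Archive itself
# distributes its parsed HAR data through BigQuery (see below).
AD_DOMAINS = ("adsafeprotected.com", "doubleverify.com", "adsrvr.org", "adnxs.com")

with open("crawl.har") as f:
    entries = json.load(f)["log"]["entries"]

for entry in entries:
    url = entry["request"]["url"]
    if any(domain in url for domain in AD_DOMAINS):
        print(entry["startedDateTime"], entry["response"]["status"], url)
```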

The HTTP Archive crawls websites using desktop and mobile user agents. Per the HTTP Archive’s methodology, it crawls from Google Cloud data centers. The HTTP Archive openly and transparently declares its crawlers via the HTTP User-Agent request header. Specifically, the HTTP Archive states that its crawlers use the following desktop and mobile user agents:

  • Desktop: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36 PTST/220609.133020

  • Mobile: Mozilla/5.0 (Linux; Android 8.1.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.115 Mobile Safari/537.36 PTST/220609.133020

Screenshot of the HTTP Archive’s methodology page - https://almanac.httparchive.org/en/2022/methodology

“PTST” is a well-known and established bot user agent. One can check the publicly available user-agents.net database to confirm that “PTST” appears as a “Bot/Crawler”.

Screenshot of the User-Agents.net database, showing that “PTST” is a known “Bot/Crawler” - https://user-agents.net/string/mozilla-5-0-x11-linux-x86-64-applewebkit-537-36-khtml-like-gecko-chrome-102-0-5005-61-safari-537-36-ptst-220609-133020

UserAgents.io - another public user agent database - also confirms that “PTST” is a known Crawler.

Screenshot of UserAgents.io - another public user agent database - showing that “PTST” is a known Crawler.

“PTST” has also been on the IAB Tech Lab Spiders and Bots robots list since 2013. “The IAB Tech Lab publishes a comprehensive list of such Spiders and Robots that helps companies identify automated traffic such as search engine crawlers, monitoring tools, and other non-human traffic that they don’t want included in their analytics and billable counts.” The “spiders and robots list supports the MRC’s General Invalid Traffic Detection and Filtration Standard by providing a common industry resource and list for facilitating IVT detection and filtration.”
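Conceptually, filtering declared bots against such a list is a simple substring check on the HTTP User-Agent request header. The sketch below illustrates the idea in Python; the IAB Tech Lab list itself is licensed to members, so the token list here is a small hypothetical excerpt rather than the actual list.

```python
# Minimal sketch of declared-bot filtering via the HTTP User-Agent header.
# BOT_TOKENS is a hypothetical excerpt, not the actual IAB Tech Lab list.
BOT_TOKENS = ["PTST", "Googlebot", "bingbot", "AhrefsBot"]

def is_declared_bot(user_agent: str) -> bool:
    """True if the User-Agent string contains a known bot/crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BOT_TOKENS)

# The HTTP Archive's declared desktop user agent from above:
ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/102.0.5005.61 Safari/537.36 PTST/220609.133020")
print(is_declared_bot(ua))  # True - "PTST" declares the HTTP Archive crawler
```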

HTTP Archive crawls from Google Cloud data center server farms. The IPs operated by Google Cloud are publicly documented on Google’s website - https://support.google.com/a/answer/10026322?hl=en

Screenshot of Google’s public documentation, showing how one can obtain Google Cloud’s IP address ranges
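In practice, checking whether a visitor’s IP address falls within Google Cloud’s published ranges can be automated. The following is a minimal sketch that assumes Google’s machine-readable range file at https://www.gstatic.com/ipranges/cloud.json (referenced in the documentation above); the example IP address is illustrative only.

```python
import ipaddress
import json
import urllib.request

# Minimal sketch, assuming Google's machine-readable range file at
# https://www.gstatic.com/ipranges/cloud.json (see the documentation above).
with urllib.request.urlopen("https://www.gstatic.com/ipranges/cloud.json") as resp:
    prefixes = json.load(resp)["prefixes"]

networks = [
    ipaddress.ip_network(p.get("ipv4Prefix") or p.get("ipv6Prefix"))
    for p in prefixes
]

def is_google_cloud_ip(ip: str) -> bool:
    """True if the address falls inside a published Google Cloud prefix."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

print(is_google_cloud_ip("35.184.0.10"))  # illustrative address only
```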

As a reminder, the Media Rating Council (MRC), an advertising accreditation body, says it is “requiring filtration of invalid data-center traffic originating from IPs associated to the three largest known hosting entities: Amazon AWS, Google and Microsoft. This means filtration of IPs within those of known hosting entities determined to be a consistent source of invalid traffic not including routing artifacts of legitimate users or virtual machine legitimate browsing” (emphasis added).

The HTTP Archive has crawled tens of millions of distinct page URLs over the course of several years.

Screenshot of the HTTP Archive Reports section, showing how many page URLs are crawled - https://httparchive.org/reports/state-of-the-web#numUrls

The HTTP Archive makes petabytes of web traffic data generated through its crawls available as a Public Dataset on Google BigQuery.

Screenshot of the HTTP Archive Google BigQuery dataset - https://httparchive.org/faq#how-do-i-use-bigquery-to-write-custom-queries-over-the-data

In the screenshot below, one can see examples of how a user can interface with and query the HTTP Archive dataset.

Example screenshot showing how to use Google BigQuery to query the public HTTP Archive dataset - https://har.fyi/guides/getting-started/


Querying the HTTP Archive dataset allows one to observe whether specific ad tech vendors transacted and/or served ads to declared bots in a known Google Cloud data center.
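For illustration, a query along the following lines could surface bot-recorded requests to a specific verification endpoint. This is a minimal sketch, not the exact queries used for this study; the table name (httparchive.all.requests) and columns reflect the HTTP Archive’s published BigQuery schema and should be verified against the current documentation (e.g., har.fyi) before use, and queries against this dataset can scan large, billable volumes of data.

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Minimal sketch: find bot-recorded requests to IAS's publisher endpoint in
# one monthly HTTP Archive crawl. Verify table/column names against the
# current schema before running; full-table scans are billable.
client = bigquery.Client()

sql = """
SELECT date, page, url
FROM `httparchive.all.requests`
WHERE date = '2024-06-01'
  AND url LIKE '%pixel.adsafeprotected.com/services/pub%'
LIMIT 100
"""

for row in client.query(sql).result():
    print(row.date, row.page, row.url)
```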

The Media Rating Council (MRC) states in its “Invalid Traffic Detection and Filtration Standards Addendum” that the “MRC is requiring filtration of invalid data-center traffic originating from IPs associated to the three largest known hosting entities: Amazon AWS, Google and Microsoft” (emphasis added).

Research Methodology - Bot web traffic source #2 - Anonymous web crawler vendor

The second bot dataset analyzed for the purposes of this research report came from an anonymous web crawler vendor. This vendor shared their web crawling data under condition of anonymity. The vendor crawls the web for various benign reasons - their purpose is not to commit ad fraud or generate ad revenue.

The given vendor crawls approximately seven million websites each month, from several dozen known data center IP addresses, such as Hetzner and Linode data center IPs.

The vendor does not declare via its HTTP User-Agent request header that it is a bot; instead, the vendor declares “valid, human” browser user agents when visiting various websites. Thus, the bot’s HTTP User-Agent request header would likely not be found on the IAB’s list of known bots and spiders. The bot utilizes different IP addresses when initiating its crawls, occasionally scanning the web from data center IPs as well as from residential proxy IPs. Thus, this bot’s browsing activity may, in some cases, meet the MRC definition of “Sophisticated Invalid Traffic” (SIVT).

Research Methodology - Bot web traffic source #3 - URLScan.io

The third bot dataset used for this empirical research report was URLScan.io. URLScan.io is a “website scanner for suspicious and malicious URLs.” Its mission is to “allow anyone to easily and confidently analyze unknown and potentially malicious websites.”

“When a URL is submitted to urlscan.io, an automated process will browse to the URL like a regular user and record the activity that this page navigation creates. This includes the domains and IPs contacted, the resources (JavaScript, CSS, etc) requested from those domains, as well as additional information about the page itself. urlscan.io will take a screenshot of the page, record the DOM content, JavaScript global variables, cookies created by the page, and a myriad of other observations.”

URLScan.io does not always declare via its HTTP User-Agent header that it is a bot; instead, the vendor’s bot sometimes declares “valid, human” browser user agents when visiting various websites. For example, the URLScan.io bot will declare itself to be a “normal” Chrome browser on a Windows or Linux desktop machine, rather than identifying itself as a bot. Thus, in some cases the URLScan.io bot’s HTTP User-Agent request header would likely not be found on the IAB’s list of known bots and spiders. Furthermore, the URLScan.io bot appears to utilize different IP addresses when initiating its crawls. URLScan.io appears to occasionally scan the web from data center IPs as well as from datacenter-based Virtual Private Network (VPN) or residential proxy IPs. Thus, ads served to URLScan.io’s crawler may, in some cases, meet the MRC definition of “Sophisticated Invalid Traffic” (SIVT).

URLScan.io has a powerful search API that allows users to look for specific previous or historical URLScan.io bot crawls. For example, in the screenshot below, one can see a search to find all instances where URLScan.io scanned the website of the Naval Criminal Investigative Service (ncis.navy.mil). One can see that URLScan.io scanned the NCIS website fifteen times in the last six years.

Screenshot of URLScan.io’s search page, showing fifteen instances where URLScan.io’s bot scanned and screenshotted the NCIS website.
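The same search can also be performed programmatically. The sketch below uses urlscan.io’s public search API endpoint (https://urlscan.io/api/v1/search/); unauthenticated queries are rate limited, and the query string shown is just the NCIS example from above.

```python
import json
import urllib.parse
import urllib.request

# Minimal sketch of a programmatic search against urlscan.io's public
# search API; unauthenticated requests are rate limited.
query = urllib.parse.quote("page.domain:ncis.navy.mil")
url = f"https://urlscan.io/api/v1/search/?q={query}&size=100"

with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

# Each result includes the scan time, the scanned URL, and a link to the
# full result (DOM, HTTP transactions, screenshot).
for result in data["results"]:
    print(result["task"]["time"], result["page"]["url"], result["result"])
```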

One can click on each individual URLScan.io scan to visualize the results of that scan. For example, one can see that URLScan.io’s bot crawled ncis.navy.mil on September 19th, 2024, and created a screenshot of the page.

Screenshot of a URLScan.io bot crawling and screenshotting the ncis.navy.mil website on September 19th, 2024. Source: https://urlscan.io/result/559c2119-a238-48e8-b7e9-5ba5d95a529b/

One can use URLScan.io to identify instances where various ad tech vendors served ads to bots. For example, in the screenshot below, one can see a US Navy recruiting ad that was served to URLScan.io’s bot on June 16th, 2023, when the bot was scanning msn.com.

Screenshot of a URLScan.io bot crawl of msn.com on June 16th, 2023, with a US Navy recruiting ad served to the bot. Source: https://urlscan.io/result/89b8d0eb-0fbd-4785-98a8-ea70cf9e2954/

Analyzing the source code of the US Navy ad shows that the ad was transacted by the Trade Desk DSP and Index Exchange SSP (seller ID 185185). The source code of the US Navy ads contains references to “charge-allDoubleVerifyBotAvoidance”. Index Exchange and HUMAN Security have also issued public statements describing an “expanded partnership that enhances Index Exchange’s global inventory across all channels and regions. Through White Ops’ comprehensive protection, the partnership protects the entirety of Index Exchange’s global inventory. It allows buyers to purchase from IX’s emerging channels, such as mobile app and Connected TV (CTV), with confidence that its supply chain is protected against invalid traffic before a bid request is ever sent to a DSP and made eligible” (emphasis added).

Screenshot of URLScan.io’s website, showing a URLScan.io bot crawl from June 16th, 2023. One can see a US Navy ad that was mediated by Trade Desk, Index Exchange, DoubleVerify, and/or HUMAN Security on msn.com.

Screenshot of URLScan.io’s HTTP requests tab for a URLScan.io bot crawl on June 16th, 2023. The screenshot shows an HTTP request to “bid.adsrvr.org/bid/feedback/casale” - a “win notification” pixel that fires when a Trade Desk ad transacts via Index Exchange SSP on msn.com

Research Methodology - Examination of putative bot traffic classifications from IAS on various sites

In addition to providing services to advertisers, Integral Ad Science (IAS) also accepts financial payments from and provides services to media publishers.

For example, one can see that IAS made $15.7 million in revenue from publishers in Q1 2024, according to IAS’s 10-Q form.

Screenshot of IAS’s Q1 2024 10-Q form, showing that IAS made $15.7 million in revenue providing services to publishers

IAS releases publisher specific tools, some of which appear to be invoked before any ads are served or ad auctions run on a given page. Specifically, IAS’s public “publisher optimization implementation guide” provides a technical overview of its publisher tools.

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

IAS’s public “Publisher Optimization Implementation Guide” explains that “The desktop and mobile web environment uses a JavaScript library. Publisher Optimization returns JSON with the response data as key/value targeting parameters.” This is loaded via a specific JavaScript file: “static.adsafeprotected.com/iasPET.1.js”.

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide”

The IAS Publisher Optimization tool “provides key/values for: [...] whether the current ad represents invalid traffic (IVT).”

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

IAS’s “publisher optimization implementation guide” reveals that the vendor uses the “fr” key to indicate “IAS Invalid Traffic”. Possible values for the “fr” key include “fr=true” and “fr=false”. The guide explains that this indicates “Whether the ad represents invalid traffic (for example, the visitor is a bot rather than a human).” “fr=true” indicates the visitor is classified as a bot, whilst “fr=false” indicates the visitor is classified as a human.

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

In the screenshot below, one can see an example of IAS’s publisher optimization tools returning an HTTP response from the endpoint “pixel.adsafeprotected.com/services/pub”, wherein the “fr” value is set to “fr=false”.

Screenshot of chrome developer tools, showing that “fr: false” in the lower right hand corner, for IAS’s endpoint “pixel.adsafeprotected.com/services/pub” on usatoday.com.

Given a sample dataset of 100% confirmed bot traffic as a ground truth baseline, one can perform a calibration exercise to estimate the sensitivity (“True Positive Rate”) of the IAS publisher optimization software for detecting bot traffic.
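The arithmetic of this calibration is straightforward: because every observation in the sample is known with certainty to be a bot, each “fr=true” response is a true positive and each “fr=false” response is a false negative. The sketch below illustrates the computation; the counts are hypothetical placeholders, not figures from this study.

```python
# Calibration arithmetic: with 100% known bot traffic, "fr=true" responses
# are true positives (TP) and "fr=false" responses are false negatives (FN).
def sensitivity(fr_true: int, fr_false: int) -> float:
    """True Positive Rate = TP / (TP + FN)."""
    return fr_true / (fr_true + fr_false)

# Hypothetical counts for illustration only (not figures from this study):
tp, fn = 230, 770
print(f"sensitivity: {sensitivity(tp, fn):.0%}")    # -> 23%
print(f"bots labeled human: {fn / (tp + fn):.0%}")  # -> 77%
```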

For example, for a set of bot crawls, one can see whether IAS’s “pixel.adsafeprotected.com” endpoint returned a value of “fr=true” or “fr=false”. In the screenshot below, one can see an instance where URLScan.io’s bot crawled thetimes.com on October 20th, 2024. One can see that, when the bot was crawling thetimes.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”.

Composite screenshot of URLScan.io’s bot crawling thetimes.com on October 20th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

Research Methodology - Digital forensics of ad source code

When an ad is served to a bot in a data center and recorded by the bot (in the form of an HTTP network traffic logfile), one can inspect the source code of the HTTP requests and resources used to serve the ad to the given bot.

Analyzing the source code of digital ads allows one to identify which Supply Side Platforms (SSPs) or ad exchanges were involved in transacting a given ad. For example, if an ad was served via the TripleLift header bidding adapter or contained an image pixel whose “src” (source) attribute was the endpoint “tlx.3lift.com/s2s/notify” or “tlx.3lift.com/header/notify”, the given ad was labeled as having been transacted by the TripleLift ad exchange. Similarly, if a served ad contained an image pixel whose “src” attribute was the endpoint “bid.adsrvr.org/bid/feedback” or “us-east-1.event.prod.bidr.io/log/imp/”, the given ad was labeled as having been transacted by Trade Desk or Beeswax DSP, respectively.
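The sketch below illustrates this labeling heuristic; the signature-to-vendor map shows only the endpoints named above and is not the study’s full signature set.

```python
# Minimal sketch of the endpoint-signature labeling heuristic described
# above; this map is illustrative, not exhaustive.
VENDOR_SIGNATURES = {
    "tlx.3lift.com/s2s/notify": "TripleLift (SSP)",
    "tlx.3lift.com/header/notify": "TripleLift (SSP)",
    "bid.adsrvr.org/bid/feedback": "Trade Desk (DSP)",
    "us-east-1.event.prod.bidr.io/log/imp/": "Beeswax (DSP)",
}

def label_ad_request(url: str):
    """Return the vendor label for a win-notification pixel URL, if any."""
    for signature, vendor in VENDOR_SIGNATURES.items():
        if signature in url:
            return vendor
    return None

print(label_ad_request("https://bid.adsrvr.org/bid/feedback/casale?iid=..."))
# -> "Trade Desk (DSP)"
```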

For a subset of the digital ads observed, it appears possible to determine from an ad’s source code whether a given brand appeared to be “charged” (invoiced) for bot avoidance or page quality services from a given vendor. The source code of some ads appears to contain (in Base64 encoded text) references to specific data segments, vendors, or services that were applied for a given ad campaign.

For example, many US Navy ads observed being served to bots in data centers contained Javascript from doubleverify.com and source code from a demand side platform, which included Base64 encoded text. For example, in the screenshot below, one can see a US Navy ad served to a bot that was crawling msn.com (source: https://urlscan.io/result/89b8d0eb-0fbd-4785-98a8-ea70cf9e2954/#summary).

Inspecting the source code of the HTTP traffic that served the ad shows that the bid response was served via Microsoft Xandr’s (formerly known as AppNexus) Prebid server via the endpoint: https://ib.adnxs.com/ut/v3.

Screenshot from URLScan.io showing the individual HTTP POST request that served the US Navy ad to a bot. Source: https://urlscan.io/result/89b8d0eb-0fbd-4785-98a8-ea70cf9e2954/#transactions

One can inspect the HTTP response from Microsoft Xandr’s Prebid server to see that the bid response was transacted by Index Exchange supply side platform (which owns the domain casalemedia.com) and by the demand side platform Trade Desk.

Screenshots of source code of an HTTP bid response from Microsoft Xandr’s Prebid server, when serving a US Navy ad to a bot. Source: https://urlscan.io/responses/7e2cd77f4823cb3d71121e03e04b4d970c0dfb76439618d4f6775df06daf719c/

One can carefully analyze the Trade Desk demand side platform win notification pixel that was triggered when the US Navy ad was served to the bot.

The US Navy ad includes a DSP win notification pixel with the “dur=” query string parameter.

The value of the “dur=” query string parameter from the US Navy DSP win notification pixel is a base64 encoded string: “Cj4KIWNoYXJnZS1tYXhEb3VibGVWZXJpZnlCcmFuZFNhZmV0eSIZCNv__________wESDGRvdWJsZXZlcmlmeQo7Ch1jaGFyZ2UtYWxsVFREQ3VzdG9tQ29udGV4dHVhbCIaCNr__________wESDXR0ZGNvbnRleHR1YWwKPwoiY2hhcmdlLWFsbERvdWJsZVZlcmlmeUJvdEF2b2lkYW5jZSIZCOf__________wESDGRvdWJsZXZlcmlmeQpECiljaGFyZ2UtYWxsRGlzcGxheVZpZXdhYmlsaXR5QmlkQWRqdXN0bWVudCIXCJr__________wESCnEtYWxsaWFuY2UKMAoMY2hhcmdlLWFsbC0xIiAI____________ARITdHRkX2RhdGFfZXhjbHVzaW9ucwpICiFjaGFyZ2UtYWxsTW9hdFZpZXdhYmlsaXR5VHJhY2tpbmciIwil__________8BEg5tb2F0LXJlcG9ydGluZyoGCKCNBhgM”.

If one copies that specific “dur=” query string parameter and opens the website cyberchef.org (an online web utility provided by the UK signals intelligence agency GCHQ that helps Base64 decode text), one can see the decoded value of this specific “dur=” query string parameter. cyberchef.org shows that the decoded value of the “dur=” query string parameter from the US Navy ad contains references to “charge-allDoubleVerifyBotAvoidance”.

Screenshot of the website cyberchef.org, with the “dur=” query string parameter text from the same US Navy ad that was served to a bot inputted, and the Base64 decoded text in the lower part of the screen. In the lower part of the screenshot, one can observe a reference to “charge-allDoubleVerifyBotAvoidance”. Full link: https://cyberchef.org/#recipe=From_Base64('A-Za-z0-9%2B/%3D',true,false)&input=Q2o0S0lXTm9ZWEpuWlMxdFlYaEViM1ZpYkdWV1pYSnBabmxDY21GdVpGTmhabVYwZVNJWkNOdl9fX19fX19fX193RVNER1J2ZFdKc1pYWmxjbWxtZVFvN0NoMWphR0Z5WjJVdFlXeHNWRlJFUTNWemRHOXRRMjl1ZEdWNGRIVmhiQ0lhQ05yX19fX19fX19fX3dFU0RYUjBaR052Ym5SbGVIUjFZV3dLUHdvaVkyaGhjbWRsTFdGc2JFUnZkV0pzWlZabGNtbG1lVUp2ZEVGMmIybGtZVzVqWlNJWkNPZl9fX19fX19fX193RVNER1J2ZFdKc1pYWmxjbWxtZVFwRUNpbGphR0Z5WjJVdFlXeHNSR2x6Y0d4aGVWWnBaWGRoWW1sc2FYUjVRbWxrUVdScWRYTjBiV1Z1ZENJWENKcl9fX19fX19fX193RVNDbkV0WVd4c2FXRnVZMlVLTUFvTVkyaGhjbWRsTFdGc2JDMHhJaUFJX19fX19fX19fX19fQVJJVGRIUmtYMlJoZEdGZlpYaGpiSFZ6YVc5dWN3cElDaUZqYUdGeVoyVXRZV3hzVFc5aGRGWnBaWGRoWW1sc2FYUjVWSEpoWTJ0cGJtY2lJd2lsX19fX19fX19fXzhCRWc1dGIyRjBMWEpsY0c5eWRHbHVaeW9HQ0tDTkJoZ00
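The same decoding can be reproduced offline. The sketch below decodes the “dur=” value from the US Navy win notification pixel shown above (URL-safe Base64) and extracts the human-readable “charge-” segment names embedded in the binary payload; it is an offline equivalent of the CyberChef recipe.

```python
import base64
import re

# The "dur=" value from the US Navy win notification pixel shown above,
# split across lines for readability.
dur = (
    "Cj4KIWNoYXJnZS1tYXhEb3VibGVWZXJpZnlCcmFuZFNhZmV0eSIZCNv__________wES"
    "DGRvdWJsZXZlcmlmeQo7Ch1jaGFyZ2UtYWxsVFREQ3VzdG9tQ29udGV4dHVhbCIaCNr__________wES"
    "DXR0ZGNvbnRleHR1YWwKPwoiY2hhcmdlLWFsbERvdWJsZVZlcmlmeUJvdEF2b2lkYW5jZSIZCOf__________wES"
    "DGRvdWJsZXZlcmlmeQpECiljaGFyZ2UtYWxsRGlzcGxheVZpZXdhYmlsaXR5QmlkQWRqdXN0bWVudCIXCJr__________wES"
    "CnEtYWxsaWFuY2UKMAoMY2hhcmdlLWFsbC0xIiAI____________ARITdHRkX2RhdGFfZXhjbHVzaW9ucwpI"
    "CiFjaGFyZ2UtYWxsTW9hdFZpZXdhYmlsaXR5VHJhY2tpbmciIwil__________8BEg5tb2F0LXJlcG9ydGluZyoGCKCNBhgM"
)

# URL-safe Base64 decode, adding any padding the raw value omits.
decoded = base64.urlsafe_b64decode(dur + "=" * (-len(dur) % 4))

# Pull out the human-readable segment names embedded in the binary payload.
print(re.findall(rb"charge-[A-Za-z0-9-]+", decoded))
# [b'charge-maxDoubleVerifyBrandSafety', b'charge-allTTDCustomContextual',
#  b'charge-allDoubleVerifyBotAvoidance', ...]
```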

As another example, New York state government (nystateofhealth.ny.gov) ads were observed being served to bots operating out of data centers. Many of the New York state government ads observed being served to bots in data centers contained Javascript from doubleverify.com and source code from a demand side platform, which included Base64 encoded text.

Screenshot of a New York state government ad creative that was served to a bot operating out of a data center by Trade Desk and OpenX. The source code of the New York state government ad contains base64 encoded references to: “charge-allDoubleVerifyBotAvoidance”

For example, a New York state government ad was served to a bot that was crawling mail.com (source: https://urlscan.io/result/4c758e41-2ce3-4afe-8875-8ec33ed24dfe/#summary). The URLScan.io bot was operating out of an M247 data center when it was served the New York state government ad. The ad was transacted by Trade Desk and OpenX SSP.

URLScan.io bot screenshot generated when the bot was operating out of a data center and crawling mail.com. The bot was served a New York state government ad by the Trade Desk and OpenX SSP. The source code of the New York state government ad includes base64 encoded references to “charge-allDoubleVerifyBotAvoidance”

One can inspect the HTTP response from OpenX’s server to see that the bid response was transacted by OpenX supply side platform (which owns the domain openx.net) and by the demand side platform Trade Desk.

Screenshots of source code of an HTTP bid response from OpenX’s prebid endpoint, when serving a New York state government ad to a bot in a data center. Source: https://urlscan.io/responses/db26f7261aa31fa92c079a4aa9024a538d591cfc24cad7a481bb3923b6b1d5a1/

One can carefully analyze the Trade Desk demand side platform win notification pixel that was triggered when the New York state government ad was served to the bot operating out of the M247 data center.

The New York state government ad includes a DSP win notification pixel with the “dur=” query string parameter.

The value of the “dur=” query string parameter from the New York state government DSP win notification pixel is a base64 encoded string: “Cj4KIWNoYXJnZS1tYXhEb3VibGVWZXJpZnlCcmFuZFNhZmV0eSIZCNv__________wESDGRvdWJsZXZlcmlmeQo7Ch1jaGFyZ2UtYWxsVFREQ3VzdG9tQ29udGV4dHVhbCIaCNr__________wESDXR0ZGNvbnRleHR1YWwKPwoiY2hhcmdlLWFsbERvdWJsZVZlcmlmeUJvdEF2b2lkYW5jZSIZCOf__________wESDGRvdWJsZXZlcmlmeQowCgxjaGFyZ2UtYWxsLTEiIAj___________8BEhN0dGRfZGF0YV9leGNsdXNpb25z”.

If one copies that specific “dur=” query string parameter and opens the website cyberchef.org (an online web utility provided by the UK signals intelligence agency GCHQ that helps Base64 decode text), one can see the decoded value of this specific “dur=” query string parameter. cyberchef.org shows that the decoded value of the “dur=” query string parameter from the New York state government ad contains references to “charge-allDoubleVerifyBotAvoidance”.

Source: https://cyberchef.org/#recipe=From_Base64('A-Za-z0-9%2B/%3D',true,false)&input=Q2o0S0lXTm9ZWEpuWlMxdFlYaEViM1ZpYkdWV1pYSnBabmxDY21GdVpGTmhabVYwZVNJWkNOdl9fX19fX19fX193RVNER1J2ZFdKc1pYWmxjbWxtZVFvN0NoMWphR0Z5WjJVdFlXeHNWRlJFUTNWemRHOXRRMjl1ZEdWNGRIVmhiQ0lhQ05yX19fX19fX19fX3dFU0RYUjBaR052Ym5SbGVIUjFZV3dLUHdvaVkyaGhjbWRsTFdGc2JFUnZkV0pzWlZabGNtbG1lVUp2ZEVGMmIybGtZVzVqWlNJWkNPZl9fX19fX19fX193RVNER1J2ZFdKc1pYWmxjbWxtZVFvd0NneGphR0Z5WjJVdFlXeHNMVEVpSUFqX19fX19fX19fX184QkVoTjBkR1JmWkdGMFlWOWxlR05zZFhOcGIyNXo

As a third example, IBM ads were observed being served to bots operating out of data centers. Many of the IBM ads observed being served to bots in data centers contained Javascript from doubleverify.com and source code from a demand side platform, which included Base64 encoded text.

For example, an IBM ad was served to a bot that was crawling the website gyytfguy654rffyu7655rrfhhu776444rfgghu7755rrggy7stmmmwwewe5t.hamidstm-stm.workers.dev (source: https://urlscan.io/result/65643690-293b-40e7-bba8-029588d7a677/). The URLScan.io bot was operating out of a data center when it was served the IBM ad. The ad was transacted by Trade Desk and Microsoft Xandr SSP.

URLScan.io bot screenshot generated when the bot was operating out of a data center and crawling gyytfguy654rffyu7655rrfhhu776444rfgghu7755rrggy7stmmmwwewe5t.hamidstm-stm.workers.dev. The bot was served an IBM ad by the Trade Desk and Microsoft Xandr SSP. The source code of the IBM ad includes base64 encoded references to “charge-allDoubleVerifyBotAvoidance”

Screenshot of URLScan.io, showing a POST HTTP request was made to ib.adnxs.com/ut/v3, a Prebid header bidding endpoint operated by Microsoft Xandr (f/k/a AppNexus).

One can inspect the HTTP response from Microsoft Xandr’s server to see that the bid response was transacted by Microsoft Xandr supply side platform (which owns the endpoint ib.adnxs.com/ut/v3/prebid) and by the demand side platform Trade Desk.

Screenshots of source code of an HTTP bid response from Microsoft Xandr’s Prebid endpoint, when serving an IBM ad to a bot in a data center. Source: https://urlscan.io/responses/db9ff1ce619c22da747c24b23ad2dabb45c0f7a8661d5e88a44022c8ca3521a6/

One can carefully analyze the Trade Desk demand side platform win notification pixel that was triggered when the IBM ad was served to the bot operating out of the data center whilst crawling the website gyytfguy654rffyu7655rrfhhu776444rfgghu7755rrggy7stmmmwwewe5t.hamidstm-stm.workers.dev.

The IBM ad includes a DSP win notification pixel with the “dur=” query string parameter.

The value of the “dur=” query string parameter from the IBM DSP win notification pixel is a base64 encoded string: “CiMKDmNoYXJnZS1hbGwtMTIyIhEIhv__________ARIEaWF2Mgo-CiFjaGFyZ2UtbWF4RG91YmxlVmVyaWZ5QnJhbmRTYWZldHkiGQjb__________8BEgxkb3VibGV2ZXJpZnkKPgohY2hhcmdlLWFsbERvdWJsZVZlcmlmeVZpZXdhYmlsaXR5IhkI8P__________ARIMZG91YmxldmVyaWZ5Cj8KImNoYXJnZS1hbGxEb3VibGVWZXJpZnlCb3RBdm9pZGFuY2UiGQjn__________8BEgxkb3VibGV2ZXJpZnkKTwowY2hhcmdlLWFsbERvdWJsZVZlcmlmeURpc3BsYXlWaWV3YWJpbGl0eVRyYWNraW5nIhsIt___________ARIMZHYtcmVwb3J0aW5nKgA”.

If one copies that specific “dur=” query string parameter and opens the website cyberchef.org (an online web utility provided by the UK signals intelligence agency GCHQ that helps Base64 decode text), one can see the decoded value of this specific “dur=” query string parameter. cyberchef.org shows that the decoded value of the “dur=” query string parameter from the IBM ad contains references to “charge-allDoubleVerifyBotAvoidance”.

Source: https://cyberchef.org/#recipe=From_Base64('A-Za-z0-9%2B/%3D',false,false)&input=Q2lNS0RtTm9ZWEpuWlMxaGJHd3RNVEl5SWhFSWh2X19fX19fX19fX0FSSUVhV0YyTWdvLUNpRmphR0Z5WjJVdGJXRjRSRzkxWW14bFZtVnlhV1o1UW5KaGJtUlRZV1psZEhraUdRamJfX19fX19fX19fOEJFZ3hrYjNWaWJHVjJaWEpwWm5rS1Bnb2hZMmhoY21kbExXRnNiRVJ2ZFdKc1pWWmxjbWxtZVZacFpYZGhZbWxzYVhSNUloa0k4UF9fX19fX19fX19BUklNWkc5MVlteGxkbVZ5YVdaNUNqOEtJbU5vWVhKblpTMWhiR3hFYjNWaWJHVldaWEpwWm5sQ2IzUkJkbTlwWkdGdVkyVWlHUWpuX19fX19fX19fXzhCRWd4a2IzVmliR1YyWlhKcFpua0tUd293WTJoaGNtZGxMV0ZzYkVSdmRXSnNaVlpsY21sbWVVUnBjM0JzWVhsV2FXVjNZV0pwYkdsMGVWUnlZV05yYVc1bkloc0l0X19fX19fX19fX19BUklNWkhZdGNtVndiM0owYVc1bktnQQ

As a fourth example, Singaporean Government ads were observed being served to bots operating out of data centers. Many of the Singaporean Government ads observed being served to bots in data centers contained source code from a demand side platform, which included Base64 encoded text.

For example, a Singaporean Government ad was served to a bot that was crawling the website ihwal.id (source: https://urlscan.io/result/50a3059f-ec17-4135-ba2b-563de4bc9b9a/). The URLScan.io bot was operating out of a data center when it was served the Singaporean Government ad. The ad was transacted by Trade Desk and Google AdX.

One can inspect the HTTP response from Google’s server to see that the bid response was transacted by Google AdX (which owns the endpoint googleads.g.doubleclick.net) and by the demand side platform Trade Desk.

Screenshots of source code of an HTTP bid response from Google AdX, when serving a Singaporean Government ad to a bot in a data center. Source: https://urlscan.io/responses/0bb403b5e03b91df5dd7625c03bc643c89970a32270ac44e995b88f84bdea8da/

One can carefully analyze the Trade Desk demand side platform win notification pixel that was triggered when the Singaporean Government ad was served to the bot operating out of the data center whilst crawling the website ihwal.id.

Source: https://urlscan.io/result/50a3059f-ec17-4135-ba2b-563de4bc9b9a/#transactions

The Singaporean Government ad includes a DSP win notification pixel with the “dur=” query string parameter.

The value of the “dur=” query string parameter from the Singaporean Government DSP win notification pixel is a base64 encoded string: “CjgKHmNoYXJnZS1hbGxHcmFwZXNob3RCcmFuZFNhZmV0eSIWCPb__________wESCWdyYXBlc2hvdAo-CiFjaGFyZ2UtbWF4RG91YmxlVmVyaWZ5QnJhbmRTYWZldHkiGQjx__________8BEgxkb3VibGV2ZXJpZnkKPgohY2hhcmdlLWFsbERvdWJsZVZlcmlmeVZpZXdhYmlsaXR5IhkI8P__________ARIMZG91YmxldmVyaWZ5Cj8KImNoYXJnZS1hbGxEb3VibGVWZXJpZnlCb3RBdm9pZGFuY2UiGQjn__________8BEgxkb3VibGV2ZXJpZnkKNwodY2hhcmdlLW1heEdyYXBlc2hvdENhdGVnb3JpZXMiFgj3__________8BEglncmFwZXNob3Q”.

If one copies that specific “dur=” query string parameter and opens the website cyberchef.org (an online web utility that helps Base64 decode text), one can see the decoded value of this specific “dur=” query string parameter. cyberchef.org shows that the decoded value of the “dur=” query string parameter from the Singaporean Government ad contains references to “charge-allDoubleVerifyBotAvoidance”.

Source: https://cyberchef.org/#recipe=From_Base64('A-Za-z0-9%2B/%3D',false,false)&input=Q2pnS0htTm9ZWEpuWlMxaGJHeEhjbUZ3WlhOb2IzUkNjbUZ1WkZOaFptVjBlU0lXQ1BiX19fX19fX19fX3dFU0NXZHlZWEJsYzJodmRBby1DaUZqYUdGeVoyVXRiV0Y0Ukc5MVlteGxWbVZ5YVdaNVFuSmhibVJUWVdabGRIa2lHUWp4X19fX19fX19fXzhCRWd4a2IzVmliR1YyWlhKcFpua0tQZ29oWTJoaGNtZGxMV0ZzYkVSdmRXSnNaVlpsY21sbWVWWnBaWGRoWW1sc2FYUjVJaGtJOFBfX19fX19fX19fQVJJTVpHOTFZbXhsZG1WeWFXWjVDajhLSW1Ob1lYSm5aUzFoYkd4RWIzVmliR1ZXWlhKcFpubENiM1JCZG05cFpHRnVZMlVpR1Fqbl9fX19fX19fX184QkVneGtiM1ZpYkdWMlpYSnBabmtLTndvZFkyaGhjbWRsTFcxaGVFZHlZWEJsYzJodmRFTmhkR1ZuYjNKcFpYTWlGZ2ozX19fX19fX19fXzhCRWdsbmNtRndaWE5vYjNR

Research Results

Research Results: IAS’s publisher services pixel appears to label bots as valid human traffic on certain occasions

IAS’s public “Publisher Optimization Implementation Guide” explains that “The desktop and mobile web environment uses a JavaScript library. Publisher Optimization returns JSON with the response data as key/value targeting parameters.” This is loaded via a specific JavaScript file: “static.adsafeprotected.com/iasPET.1.js”.

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

The IAS Publisher Optimization tool “provides key/values for: [...] whether the current ad represents invalid traffic (IVT).”

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

IAS’s “publisher optimization implementation guide” reveals that the vendor uses the “fr” key to indicate “IAS Invalid Traffic”. Possible values for the “fr” key are “fr=true” or “fr=false”. The guide explains that this indicates “Whether the ad represents invalid traffic (for example, the visitor is a bot rather than a human).”

Screenshot of Integral Ad Science (IAS)’s public “Publisher Optimization Implementation Guide” - Original Link: https://helpcenter.integralplatform.com/article/publisher-optimization-implementation-guide; Archived: https://perma.cc/G6EZ-76VR;

Given a sample dataset of 100% confirmed bot traffic as a ground truth baseline, one can analyze within the sample how often IAS’ publisher optimization tool returns a value of “fr=true” (bot) or “fr=false” (not bot).

For example, for a set of bot crawls, one can see whether IAS’s “pixel.adsafeprotected.com” endpoint returned a value of “fr=true” or “fr=false”.

In the screenshot below, one can see an instance where URLScan.io’s bot crawled thetimes.com on October 20th, 2024. One can see that, when the bot was crawling thetimes.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”.

Composite screenshot of URLScan.io’s bot crawling thetimes.com on October 20th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

As a second example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled wired.com on September 10th, 2024. One can see that, when the bot was crawling wired.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic.

Composite screenshot of URLScan.io’s bot crawling wired.com on September 10th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

As a third example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled wsj.com on November 18th, 2024. One can see that, when the bot was crawling wsj.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic.

Composite screenshot of URLScan.io’s bot crawling wsj.com on November 18th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

As a fourth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled washingtonpost.com on September 2nd, 2024. One can see that, when the bot was crawling washingtonpost.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic.

Composite screenshot of URLScan.io’s bot crawling washingtonpost.com on Sept. 2nd, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

As a fifth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled cnn.com on November 12th, 2024. One can see that, when the bot was crawling cnn.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic.

Composite screenshot of URLScan.io’s bot crawling cnn.com on Nov 12th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”.

As a sixth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled forbes.com on November 6th, 2024. One can see that, when the bot was crawling forbes.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic. Forbes also appears to have installed the DoubleVerify publisher optimization tool on its web pages.

Composite screenshot of URLScan.io’s bot crawling forbes.com on November 6th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”. Forbes also appears to use DoubleVerify’s publisher optimization tools on its pages.

As a seventh example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled barrons.com on November 19th, 2024. One can see that, when the bot was crawling barrons.com, IAS’s server endpoint “pixel.adsafeprotected.com/services/pub” was invoked. That HTTP endpoint returned a JSON response, with the “fr” key value set to “false”, as in “fr=false”. This suggests that according to IAS's adjudication, the URLScan.io bot does not constitute invalid traffic, and is therefore being classified as valid, human traffic.

Composite screenshot of URLScan.io’s bot crawling barrons.com on November 19th, 2024. The screenshot shows HTTPS requests to “pixel.adsafeprotected.com”, which elicited an HTTPS response with “fr”: “false”, suggesting that IAS classified the given request as valid, human traffic. It appears that barrons.com served two potentially direct sold (insertion order) ads for Google Cloud to the bot. Source: https://urlscan.io/result/03993bbd-0028-4fc1-a79e-8c04df78e808/

As explained earlier, URLScan.io is a web-based security tool and public sandbox that analyzes websites by automating visits to URLs with a headless browser bot, capturing detailed metadata including HTTP requests, DNS lookups, and visual snapshots, which can be used for threat analysis and research.

Based on a sample of URLScan.io crawls from 2019 to 2024 (5+ years) across over 1,200 distinct websites in North America, Europe, Japan, Singapore, Australia, and other countries, it is possible to see how often IAS’ Publisher Optimization tool labeled URLScan.io’s bot as “fr=true” (meaning, invalid traffic or a bot), versus “fr=false” (meaning, valid traffic or a human).

This dataset was compiled from instances where the bot crawled many different websites, including wsj.com, nfl.com, weather.com, cnn.com, forbes.com, americanbanker.com, tripadvisor.com, genius.com, reuters.com, bloomberg.com, washingtonpost.com, businessinsider.com, usatoday.com, wired.com, newyorker.com, vogue.com, vanityfair.com, lefigaro.fr, and others.

In some cases, IAS’s Publisher Optimization tool correctly identified URLScan.io’s bot as a bot, and labeled it as “fr=true” (meaning it’s a bot). In many cases however, URLScan.io’s bot crawling a page was labeled as “fr=false”.

Across the sample of URLScan.io crawls from 2019-2024, IAS’s Publisher Optimization tool labeled the URLScan.io bot as “fr=false” (valid traffic or human) 77% of the time.

One point that merits mentioning: in some cases, it appears that IAS’ publisher optimization tool labeled the same bot and page view session both as valid, human traffic (“fr=false”) and as invalid, bot traffic (“fr=true”).

For example, on March 17th, 2022, URLScan.io’s bot crawled the Swiss website 20minuti.ch. The IAS Publisher Optimization endpoint was invoked fifty-seven (57x) times whilst the URLScan.io bot was crawling this page on March 17th. Forty (40x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst seventeen (17x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). It is not immediately apparent from IAS’ public documentation how situations where a bot is labeled as both valid, human traffic and invalid bot traffic are dispositioned. The URLScan.io crawl record from March 17th, 2022 on 20minuti.ch can be found here: https://urlscan.io/result/be50408e-393b-451d-b571-2b85875c2c5c.

As a second example, on May 21st, 2024, URLScan.io’s bot crawled reuters.com. The IAS Publisher Optimization endpoint was invoked seventeen (17x) times whilst the URLScan.io bot was crawling this Reuters page on May 21st, 2024. Nine (9x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst eight (8x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). It is not immediately apparent from IAS’ public documentation how situations where a bot is labeled as both valid, human traffic and invalid bot traffic are resolved. The URLScan.io crawl record from May 21st, 2024 on reuters.com can be found here: https://urlscan.io/result/58d84a13-e69e-402e-848e-e01e807c4849.

As a third example, on July 19th, 2022, URLScan.io’s bot crawled theguardian.com. The IAS Publisher Optimization endpoint was invoked fifteen (15x) times whilst the URLScan.io bot was crawling this Guardian page on July 19th, 2022. Five (5x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst ten (10x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). The URLScan.io crawl record from July 19th, 2022 on theguardian.com can be found here: https://urlscan.io/result/5b63ab93-4772-4aec-9ac9-5d9611783796.

As a fourth example, on June 19th, 2024, URLScan.io’s bot crawled the Mexican-American Spanish language sports website tudn.com (owned by TelevisaUnivision). The IAS Publisher Optimization endpoint was invoked eight (8x) times whilst the URLScan.io bot was crawling this TUDN page on June 19th, 2024. Five (5x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst three (3x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). The URLScan.io crawl record from June 19th, 2024 on tudn.com can be found here: https://urlscan.io/result/29029e06-6199-43d6-b63b-9d80cfe209bb.

As a fifth example, on October 4th, 2023, URLScan.io’s bot crawled the Italian sports page sport.sky.it. The IAS Publisher Optimization endpoint was invoked eight (8x) times whilst the URLScan.io bot was crawling this sport.sky.it page on October 4th, 2023. Three (3x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst five (5x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). The URLScan.io crawl record from October 4th, 2023 on sport.sky.it can be found here: https://urlscan.io/result/06a35f50-ba0e-480c-afef-051b6ae3de91.

As a sixth example, on August 1st, 2023, URLScan.io’s bot crawled wired.com (owned by Conde Nast). The IAS Publisher Optimization endpoint was invoked seven (7x) times whilst the URLScan.io bot was crawling this wired.com page on August 1st, 2023. Two (2x) of the HTTPS responses from the “pixel.adsafeprotected.com/services/pub” endpoint indicated the URLScan.io bot was classified as invalid, bot traffic (“fr=true”), whilst five (5x) of the HTTPS responses from the IAS endpoint indicated that the URLScan.io bot was classified as valid, human traffic (“fr=false”). The URLScan.io crawl record from August 1st, 2023 on wired.com can be found here: https://urlscan.io/result/4655143b-3169-4d68-b309-6e5d1cf57cc7.

Below is a summary table of the above examples.

Table illustrating examples where IAS’ publisher optimization tool (‘pixel.adsafeprotected.com/services/pub’) returned multiple distinct “fr” classifications for the same URLScan.io bot

In addition to evaluating how IAS’ publisher optimization tool classified URLScan.io’s bots from 2019-2024, this study also examined how IAS classified HTTP Archive bots from 2022-2024. As mentioned earlier, HTTP Archive’s bots openly declared themselves to be bots via the HTTP “User-Agent” request header. The HTTP Archive bot uses a User Agent string that has been on the IAB Tech Lab’s Bots & Spiders list since 2013. This list is an industry resource compiled by the IAB Tech Lab, and various trade groups and accreditation bodies suggest that ad tech vendors utilize this reference list as a means of detecting and/or filtering out declared bot traffic. Furthermore, HTTP Archive’s bots crawl the open web from a set of publicly well-known data center IP addresses.

Based on a sample of HTTP Archive crawls from 2022 to 2024 across many distinct websites, it is possible to see how often IAS’ Publisher Optimization tool labeled HTTP Archive’s declared bot as “fr=true” (meaning, invalid traffic or a bot), versus “fr=false” (meaning, valid traffic or a human).

This dataset was compiled from instances where the declared HTTP Archive bot crawled many different websites, including fandom.com, usatoday.com, sky.it, cnn.com, wired.com, washingtonpost.com, weather.com, and others.

In some cases, IAS’s Publisher Optimization tool correctly identified HTTP Archive’s declared bot as a bot, and labeled it as “fr=true” (meaning it is a bot and invalid traffic). In many cases, however, HTTP Archive’s bot crawling a page was labeled as “fr=false” (valid, human traffic).

Across the sample of HTTP Archive crawls from 2022-2024, IAS’s Publisher Optimization tool labeled the declared HTTP Archive bot (whose User Agent string has been on the IAB Tech Lab’s Bots & Spiders list since 2013) as “fr=false” (valid traffic or human) 16% of the time.

For example, the HTTP Archive bot (which operates out of well-known data center IP addresses and whose user agent has been on the IAB Tech Lab’s Bots & Spiders list since 2013) crawled washingtonpost.com every month from January 2022 to August 2024. The declared bot was classified as valid, human traffic (“fr=false”) in March, April, May, July, August, and December of 2022, February, March, April, May, June, July, August, October, November, and December of 2023, and February, March, April, May, July, and August of 2024.

Research Results: Publishers which appear to partner with IAS and DoubleVerify and were seen serving ads to declared bots in data centers

Many publishers which appear to employ the IAS and DoubleVerify publisher optimization tools on their pages were observed serving ads to bots, including to declared bots operating out of known data center IP addresses, whose user agent has been on the IAB Tech Lab’s Bots & Spiders list since 2013.

Screenshot of DoubleVerify’s website about its publisher optimization tool. Source: https://pub.doubleverify.com/

In some cases, even when one of these vendors’ publisher optimization tools adjudicates that the given website visitor is a bot or invalid, fraudulent traffic, ads are still delivered to the bot. Some of the ads delivered after the bot was adjudicated to be bot traffic include references to IAS and DoubleVerify.

The publisher Trip Advisor appears to employ IAS’ publisher optimization tool on its web pages. On April 23, 2023, URLScan.io’s bot visited tripadvisor.com.sg, and the IAS publisher optimization tool (via endpoint “pixel.adsafeprotected.com/services/pub”) was invoked by the publisher. The IAS API endpoint returned the classification “fr=true”, indicating invalid, bot traffic. An ad for Air New Zealand was served to the bot by Trade Desk and Google. The source code of the Air New Zealand ad that was served to a bot after IAS’s tool labeled the page view as “fr=true” indicated “charge-allIntegralSuspiciousActivity”.

Screenshot of IAS technical documentation, describing IAS “Suspicious Activity” Pre-Bid Segments; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

On April 23, 2023, URLScan.io’s bot visited tripadvisor.com.sg, and the IAS publisher optimization tool (via endpoint “pixel.adsafeprotected.com/services/pub”) was invoked by the publisher. The IAS API endpoint returned the classification “fr=true”, indicating invalid, bot traffic. An ad for Air New Zealand was served to the bot by Trade Desk and Google. The source code of the Air New Zealand ad that was served to a bot after IAS’s tool labeled the page view as “fr=true” indicated “charge-allIntegralSuspiciousActivity”. Source: https://urlscan.io/result/ec0ecda0-2c0f-436e-aa63-5836aea9bb3d/

As another example, on February 16th, 2023, URLScan.io’s bot visited quiltville.blogspot.com, and the IAS publisher optimization tool (via endpoint “pixel.adsafeprotected.com/services/pub”) was invoked. The IAS API endpoint returned the classification “fr=true”, indicating invalid, bot traffic. An ad for the Grand Ole Opry music stage in Nashville, Tennessee was served to the bot by Trade Desk and Magnite. The source code of the Opry ad that was served to a bot after IAS’s tool labeled the page view as “fr=true” indicated “charge-allIntegralSuspiciousActivity”.

On February 16th, 2023, URLScan.io’s bot visited quiltville.blogspot.com, and the IAS publisher optimization tool (via endpoint “pixel.adsafeprotected.com/services/pub”) was invoked. The IAS API endpoint returned the classification “fr=true”, indicating invalid, bot traffic. An ad for the Grand Ole Opry music stage in Nashville, Tennessee was served to the bot by Trade Desk and Magnite. The source code of the Opry ad that was served to a bot after IAS’s tool labeled the page view as “fr=true” indicated “charge-allIntegralSuspiciousActivity”. Source: https://urlscan.io/result/38691aec-a650-4ff6-a05c-9065fd72c938/#summary

As another example, the website fandom.com appears to utilize pre-auction publisher optimization code from both IAS and DoubleVerify. Fandom.com’s media kit lists both vendors as “partners”.

Source: Fandom media kit - https://perma.cc/DCP2-FASY

URLScan.io’s bot crawled icehockey.fandom.com on December 24th, 2023, from a data center in Australia, with no pre-existing user IDs. One can see that DoubleVerify’s and IAS’ publisher optimization code was invoked whilst the bot was crawling the fandom.com page.

Screenshot showing an HTTP request to pub.doubleverify.com/signals/pub.json and pixel.adsafeprotected.com/services/pub made on fandom.com whilst URLScan.io’s bot was crawling the page. Source: https://urlscan.io/result/807e8b7d-0711-4a7d-920d-489ab88019b4/#transactions

The bot was served an ad from BioLife Plasma Services, transacted via Trade Desk and Microsoft Xandr SSP. The source code of the BioLife Plasma ad includes references to and code from DoubleVerify.

Screenshot of a BioLife Plasma Services ad served to URLScan.io’s bot whilst the bot was crawling fandom.com. Source: https://urlscan.io/result/807e8b7d-0711-4a7d-920d-489ab88019b4/#summary

As another example, URLScan.io’s bot crawled blackclover.fandom.com on April 2nd, 2024 with no pre-existing user IDs. One can see that DoubleVerify and IAS’ publisher optimization code was invoked whilst the bot was crawling the fandom.com page. The bot was then served an ad for Progressive insurance by Trade Desk and Index Exchange. The source code of the Progressive insurance ads includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling blackclover.fandom.com on April 2nd, 2024 whilst being shown a Progressive insurance ad that contains references to “charge-allScibids” and “adbrain”. The publisher fandom.com appears to have configured the IAS and DoubleVerify publisher optimization tools on its website, and lists IAS and DoubleVerify as “partners” in its media kit for prospective advertisers. Source: https://urlscan.io/result/54af4acf-1183-449d-a939-a79ff4c2df6c/

As another example, URLScan.io’s bot crawled netflix.fandom.com on December 7th, 2023. Fandom.com’s page invoked both the IAS and DoubleVerify publisher optimization tools. The IAS publisher optimization tool returned the classification “fr=true”, indicating invalid, bot traffic. An ad for American Express was served to the bot by Trade Desk and Microsoft Xandr. The American Express ad included references to IAS.

Screenshot of an American Express ad served to URLScan.io’s bot. DoubleVerify and IAS’ publisher optimization tools were invoked by the publisher fandom.com. The IAS publisher optimization tool labeled the session as “fr=true”, indicating invalid, bot traffic. Source: https://urlscan.io/result/d4420660-2fb8-462f-ac45-e90a771328b2/#summary

As another example, URLScan.io’s bot crawled timeout.com on May 20th, 2024. The publisher timeout.com invoked the DoubleVerify publisher tool by making an HTTPS request to “pub.doubleverify.com/signals/pub.json”. Afterwards, an ad for the National Football League (NFL) was served to the bot via Trade Desk and Sharethrough SSP. The source code of the NFL ad appeared to contain code from DoubleVerify and references to “charge-allDoubleVerifyBotAvoidance”.

As another example, United States Navy ads were served to URLScan.io’s bot whilst the bot was crawling fandom.com, which uses both IAS and DoubleVerify publisher optimization tools on its web pages.

The list of publishers who appear to utilize DoubleVerify’s publisher optimization tools and appear to have been serving ads on behalf of Fortune 500, United States Government, or other advertisers to bots includes: forbes.com, bbc.com, telegraph.co.uk, bloomberg.com, weather.com, businessinsider.com, independent.co.uk, cnbc.com, globo.com, time.com, healthline.com, cnet.com, mayoclinic.org, buzzfeed.com, newsweek.com, webmd.com, sapo.pt, theverge.com, nbcnews.com, fandom.com, elpais.com, theatlantic.com, huffpost.com, realtor.com, economist.com, infobae.com, sportskeeda.com, goal.com, clarin.com, n-tv.de, target.com, zdnet.com, mashable.com, fastcompany.com, medicalnewstoday.com, futbin.com, g1.globo.com, kooora.com, arca.live, vox.com, ge.globo.com, inc.com, ign.com, bizjournals.com, pcmag.com, euronews.com, today.com, thedailybeast.com, lifehacker.com, and many others.

The list of publishers who appear to utilize Integral Ad Science’s (IAS) publisher optimization tools and appear to have been serving ads on behalf of Fortune 500, United States Government, or other advertisers to bots includes: forbes.com, cnn.com, theguardian.com, fandom.com, washingtonpost.com, reuters.com, bloomberg.com, wsj.com, weather.com, businessinsider.com, independent.co.uk, tripadvisor.com, usatoday.com, wired.com, healthline.com, nypost.com, webmd.com, investopedia.com, bestbuy.com, people.com, huffpost.com, news.com.au, lefigaro.fr, bild.de, genius.com, infobae.com, marketwatch.com, newyorker.com, clarin.com, n-tv.de, lanacion.com.ar, vice.com, mashable.com, excite.co.jp, mundodeportivo.com, medicalnewstoday.com, fanpage.it, tudogostoso.com.br, aljazeera.com, nba.com, gizmodo.com, thesun.co.uk, arstechnica.com, entrepreneur.com, variety.com, livescience.com, smh.com.au, techradar.com, qz.com, and many others.

Research Results: Ad tech vendors who claim to partner with HUMAN Security that were observed serving ads to bots

HUMAN Security (f/k/a White Ops) is a cybersecurity company that offers products and services such as MediaGuard. “White Ops MediaGuard combines real-time preventive technology with the global footprint of evidence-based intelligence provided by White Ops FraudSensor™ to give an accurate, preemptive indication of whether a human is on the other end of a request for an ad or webpage. This low latency blocking service is designed to accommodate real time bidding scenarios, providing ad platforms the ability to prevent an ad from ever being served to a bot, and ensuring high-humanity levels in the audience bid stream” (emphasis added).

Source: HUMAN Security website

Which demand side platforms (DSPs) were observed transacting the highest relative number of ad impressions to bots in Oct 2024?

A preliminary research question was to determine which demand side platforms (DSPs) or ad software used for purchasing ads by media buyers appeared to transact the highest number of ad impressions to bots in a specific time frame.

To approach this question, data from URLScan.io and from HTTP Archive was analyzed for the month of October 2024.

URLScan.io and HTTP Archive’s crawl dataset records were queried for a number of demand side platform “selectors”. A “selector” is a term of art commonly used by signals intelligence (SIGINT) professionals working at the National Security Agency (NSA), Government Communications Headquarters (GCHQ), or Unit 8200 of the Israel Defense Forces.

According to the Office of the Director of National Intelligence (ODNI), a “selector” is a “unique identifier” associated with a target of interest. For example, a telephone number or email address can be a selector used to parse large amounts of raw signals intelligence data when trying to track a specific target as part of a foreign intelligence operation.

In this present context, “selector” refers to an identifier that can unambiguously confirm that a specific ad tech provider transacted a given ad. For example, ads transacted to a bot by the Trade Desk DSP contain win notification pixels from endpoints such as “vam-bid.adsrvr.org/bid/feedback/”. Ads transacted to a bot by Freewheel Beeswax DSP contain win notification pixels from endpoints such as “us-east-1.event.prod.bidr.io/log/imp/”. Ads transacted to a bot by Yahoo DSP contain win notification pixels such as: “pn.ybp.yahoo.com/ab/secure/true/imp/”.

Data from URLScan.io and from HTTP Archive’s bot crawl session from October 2024 was filtered and counted for the presence of various DSP specific selectors.
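The counting step can be illustrated with a minimal sketch. The three selector strings below are the examples quoted above; the input format (a plain-text export of one request URL per line, under an assumed file name) is illustrative rather than the exact pipeline used for this analysis.

```python
from collections import Counter

# DSP "selectors": substrings of win-notification pixel URLs that
# unambiguously identify which DSP transacted an ad. These three
# selectors are the examples cited in this report.
DSP_SELECTORS = {
    "Trade Desk": "vam-bid.adsrvr.org/bid/feedback/",
    "Freewheel Beeswax": "us-east-1.event.prod.bidr.io/log/imp/",
    "Yahoo DSP": "pn.ybp.yahoo.com/ab/secure/true/imp/",
}

def count_dsp_impressions(request_urls):
    """Count ad impressions per DSP by scanning crawl-session
    request URLs for DSP-specific selector substrings."""
    counts = Counter()
    for url in request_urls:
        for dsp, selector in DSP_SELECTORS.items():
            if selector in url:
                counts[dsp] += 1
    return counts

# Hypothetical usage: one request URL per line in a crawl export.
with open("crawl_requests_oct_2024.txt") as f:
    totals = count_dsp_impressions(line.strip() for line in f)
for dsp, n in totals.most_common():
    print(f"{dsp}: {n}")
```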

Eight (8) demand side platforms (DSPs) were analyzed during the course of this exploratory analysis, including:

  1. Google Display & Video 360 (DV360)

  2. Amazon DSP

  3. Trade Desk DSP

  4. Yahoo DSP

  5. Epsilon ad buying platform (distinct from Epsilon Conversant SSP)

  6. Comcast Freewheel Beeswax

  7. AdTheorent

  8. Basis Technologies (f/k/a Centro)

Based on analysis of URLScan.io data from October 2024, it appears that DV360 transacted the highest number of ad impressions to URLScan.io’s bots, as determined by the presence of DV360 related ad selectors. Amazon DSP transacted the second highest number of ads to URLScan.io’s bot, and Trade Desk transacted the third highest number of ads to URLScan.io’s bots in the October 2024 sample.

It bears mentioning that this sample was not normalized or adjusted by market adoption or utilization rates of each DSP. The fact that DV360 or Amazon were observed serving a relatively high number of ads to bots compared to other DSPs, such as Yahoo DSP, may simply be a function of DV360 or Amazon having more market adoption and transacting a higher volume of ad impressions each month than Yahoo DSP. Readers are strongly encouraged to interpret this sample with care and to consider the various artifacts that may have affected the results.

It bears mentioning that two of the DSPs or ad bidding platforms referenced above - Google DV360 and Trade Desk - claim to partner with HUMAN Security for invalid traffic (IVT) detection. HUMAN Security holds various Media Rating Council (MRC) accreditations related to bot filtration and is TAG Certified Against Fraud.

In addition to examining a sample of data from URLScan.io, data from the HTTP Archive bot was also examined to see how many ads were served to this second bot by various ad tech vendors.

It bears mentioning that HTTP Archive’s bot is a declared bot whose user agent has been on the IAB Tech Lab’s Bots & Spiders List since 2013. Furthermore, the HTTP Archive bot crawls the web from known Google Cloud data center IPs, some of which may be on industry data center lists.
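For context, the sketch below shows the two basic GIVT checks this implies: matching a user agent against a bots-and-spiders pattern list and testing an IP address against known data center ranges. The actual IAB Tech Lab and TAG lists are licensed industry files, so the patterns and CIDR blocks below are placeholders only.

```python
import ipaddress

# Placeholder patterns standing in for the licensed IAB Tech Lab
# Bots & Spiders list (real entries are case-insensitive substrings).
BOT_UA_PATTERNS = ["httparchive", "urlscan", "crawler", "spider"]

# Placeholder CIDR blocks standing in for an industry data center
# IP list (e.g. the TAG Data Center IP List).
DATACENTER_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_givt(user_agent: str, ip: str) -> bool:
    """Return True if the request matches a declared-bot user agent
    or originates from a known data center range -- the two general
    invalid traffic (GIVT) signals discussed in this section."""
    ua = user_agent.lower()
    if any(pattern in ua for pattern in BOT_UA_PATTERNS):
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATACENTER_NETWORKS)

# A declared bot UA from a placeholder data center IP is flagged:
print(is_givt("Mozilla/5.0 ... HTTPArchive/1.0", "203.0.113.7"))  # True
```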

The HTTP Archive data was similarly analyzed for October 2024. The data was filtered and counted for the presence of various DSP selectors to determine how many ads a given vendor had transacted with HTTP Archive’s bot.

Some ad tech vendors, such as AdTheorent, Basis Technologies, and Epsilon (demand side), were not observed transacting any ad impressions to HTTP Archive’s declared bot.

Google DV360 was observed serving far more ads than any other vendor in this sample. This may be counter-intuitive, as Google presumably has full awareness of which IP addresses are known Google Cloud data center IPs and might easily prevent its platforms from serving digital ads to bot traffic originating in its own data centers.

Google DV360 was observed transacting approximately 15 times as many ads to a declared bot operating out of a Google data center as the Trade Desk. Both Google DV360 and Trade Desk claim to partner with HUMAN Security for bot filtering. It bears noting that even though the two DSPs that have issued public statements about partnering with HUMAN Security were observed serving the highest numbers of ads to a declared bot on the IAB Bots & Spiders List operating out of a known data center IP, this may be a function of how much market share these two DSPs have relative to other DSPs. This analysis does not “normalize” or account for relative market share or the total volume of impressions each DSP transacted each month.
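To make the normalization caveat concrete, a fairer comparison would divide each DSP’s bot-served impression count by its total transacted volume over the same period. The figures in the sketch below are entirely hypothetical; total-volume data was not available for this analysis.

```python
# Entirely hypothetical figures, for illustration only: a DSP with
# 15x more bot impressions but 20x more total volume actually has a
# *lower* bot-serving rate.
observed = {
    "DSP A": {"bot_impressions": 15_000, "total_impressions": 10_000_000_000},
    "DSP B": {"bot_impressions": 1_000,  "total_impressions": 500_000_000},
}

for dsp, d in observed.items():
    rate = d["bot_impressions"] / d["total_impressions"]
    print(f"{dsp}: {rate:.2e} bot impressions per transacted impression")
# DSP A -> 1.50e-06, DSP B -> 2.00e-06: the higher-volume DSP has the
# lower rate despite serving more raw impressions to bots.
```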

As mentioned above, it is worth noting that some demand side platforms - such as Basis Technologies (f/k/a Centro) - were not observed transacting any ad impressions to HTTP Archive’s bot in October 2024.

Prompted by these observations, Adalytics reached out to some of the demand side platforms (DSPs) who appear to be filtering out declared bot traffic. Adalytics asked them for comments about how and why they are filtering out declared bot traffic, or preventing ads from serving to declared bots such as the HTTP Archive bot.

Basis Technologies (f/k/a Centro) agreed to provide comments “on the record” that could be cited in this report.

Adalytics asked:

What is Basis Technologies' philosophy about serving ads to declared bots operating out of known data center IPs? Should DSPs filter out and avoid bidding on ad auctions served to declared bots on the IAB Bots & Spiders List operating out of known data center IPs? What is Basis Technologies’ technical approach to ensuring that no ads were seen serving to a declared bot in Oct 2024 from [HTTP Archive] ?

Chris Coupland, Basis Technologies’ platform operations director, responded:

Basis strives to remove all invalid traffic (IVT) as per MRC guidelines, which requires ad tech vendors that measure advertising traffic to filter it from their reporting. Basis will not bid on impressions deemed to be IVT. The most basic form of this, general invalid traffic (GIVT), consists of the type referenced -- traffic originating in datacenters, user agents declared in the IAB's Spiders and Bots list, unknown user agents, etc.

Basis employs multiple methods of filtering or blocking GIVT. We maintain blocklists of known datacenter IP addresses, sourced from multiple vendors, and actively enforce the IAB's Spiders and Bots list. We’ve had these methods in place for many years and are why the 'academic bot' doesn't show up in Adalytics reporting.

The next two sections of this report will go into greater detail about how the two demand side platforms that claim to partner with HUMAN Security - Google DV360 and Trade Desk - were observed serving ads for many US Government and Fortune 500 advertisers to bots.

Google DV360 ads served to bots

According to Google’s public documentation, Google has “integrated with HUMAN, which serves as an extra safety check for our invalid traffic defenses. This additional safeguard requires no additional configuration or changes from you.”

Source: Google Campaign Manager 360 Help - https://support.google.com/campaignmanager/answer/6076504?hl=en

Many brands were observed as having their ads served to bots with no pre-existing user IDs operating out of data center IP addresses. In many cases, the bots were declared bots whose user agent has been listed on the IAB Tech Lab’s Bots & Spiders List since 2013.

For example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website mlb.com on October 11th, 2024 with no pre-existing user IDs. While the bot was crawling mlb.com, the bot was served an ad for the United States Army. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify. An ad for Nike shoes was also served to the bot by DV360 and contained code from DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/f809c0aa-065e-4e8a-b5b5-53ddd500b6bd/#summary

As a second example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website allmusic.com on July 31st, 2024 with no pre-existing user IDs. While the bot was crawling allmusic.com, it was served an ad for the United States Army. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/6f8980c1-3f02-48e1-a8a6-8f2a58b8f55d/#summary

As a third example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website cbr.com on March 12th 2024 with no pre-existing user IDs. While the bot was crawling cbr.com, it was served an ad for the United States Army. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/1ac3687b-4e9b-4c01-967f-a27908e3023c/#summary

As a fourth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website nypost.com on March 25th 2023 with no pre-existing user IDs. While the bot was crawling nypost.com, it was served an “in-stream” video ad for the United States Army in a muted, auto-playing out-stream video slot. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/c536156b-db60-4f6f-ae89-759af7f5f6be/

As a fifth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website nypost.com on April 15th 2023 with no pre-existing user IDs. While the bot was crawling nypost.com, it was served an “in-stream” video ad for the United States Army in a muted, auto-playing out-stream video slot. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

A video ad for the Centers for Disease Control and Prevention’s vaccines.gov was also served to the bot with no user IDs. The clickthrough URL on the vaccines.gov video ad was “vaccines.gov/?s_cid=35527:15_Opening-video:prg.dv:p:RG:GM:A25-49:VC3UV:FY2”. The video ad is not visible in the screenshot generated by the bot. The video ad was wrapped in IAS VAST tag wrappers, transacted by Google DV360 DSP, and includes references to Reingold media agency.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. A vaccines.gov video ad wrapped in IAS tags was also served to the bot, but is not visible in the screenshot. The Army video ad can be seen in full here: https://perma.cc/PVB3-F5V5. Source: https://urlscan.io/result/99b3d5c7-5881-4b14-930e-d84441a3d179/

As a sixth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website nypost.com on July 30th 2023 with no pre-existing user IDs. While the bot was crawling nypost.com, it was served an “in-stream” video ad for the United States Army in a muted, auto-playing out-stream video slot. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/1c41cdf0-f6b2-4418-8ab8-d8b7129818f1/#summary

As a seventh example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website nypost.com on December 23rd 2023 with no pre-existing user IDs. While the bot was crawling nypost.com, the bot was served an ad for the United States Army. The source code of the US Army ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the ad suggests that the ad was transacted by Omnicom media agency, and the Army ad includes code from verification vendor DoubleVerify.

Screenshot of a US Army ad served to a bot. The US Army ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security) and Omnicom media agency. The Army ad includes code from DoubleVerify. Source: https://urlscan.io/result/b86e6840-a3db-4adf-8670-4ff5e1325f04/

As an eighth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website 2beesinapod.com on February 17th, 2023 with no pre-existing user IDs. While the bot was crawling 2beesinapod.com, the bot was served an ad for JPMorgan Chase. The source code of the JPMorgan Chase ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the JPMorgan Chase ad included code from verification vendor IAS.

Screenshot of a JPMorgan Chase ad served to a bot. The JPMorgan Chase ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The JPMorgan Chase ad includes code from IAS. Source: https://urlscan.io/result/ee5bacea-02a1-46c8-835f-a193d669368a/#summary

As a ninth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website isabeleats.com on September 16th, 2024 with no pre-existing user IDs. While the bot was crawling isabeleats.com, the bot was served an ad for JPMorgan Chase. The source code of the JPMorgan Chase ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the JPMorgan Chase ad included code from verification vendor IAS.

Screenshot of a JPMorgan Chase ad served to a bot. The JPMorgan Chase ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The JPMorgan Chase ad includes code from IAS. Source: https://urlscan.io/result/41949f81-9da0-46a4-b0e5-29417c7074b1/

As a tenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website slickdeals.net on May 12th, 2023 with no pre-existing user IDs. While the bot was crawling slickdeals.net, the bot was served an ad for JPMorgan Chase. The source code of the JPMorgan Chase ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the JPMorgan Chase ad included code from verification vendor IAS.

Screenshot of a JPMorgan Chase ad served to a bot. The JPMorgan Chase ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The JPMorgan Chase ad includes code from IAS. Source: https://urlscan.io/result/fc75dda4-f6a3-43df-a0c4-c0f969fdfbcc/#summary

As an eleventh example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website eliteprospects.com on November 24th, 2023 with no pre-existing user IDs. While the bot was crawling eliteprospects.com, the bot was served an ad for American Express. The source code of the American Express ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the American Express ad included code from verification vendor IAS.

Screenshot of an American Express ad served to a bot. The American Express ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The American Express ad includes code from IAS. Source: https://urlscan.io/result/96bea181-f0ef-443f-9c15-4a875a07fab5

As a twelfth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website moviemistakes.com on November 21st, 2024 with no pre-existing user IDs. While the bot was crawling moviemistakes.com, the bot was served an ad for American Express. The source code of the American Express ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the American Express ad included code from verification vendor IAS.

Screenshot of an American Express ad served to a bot. The American Express ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The American Express ad includes code from IAS. Source: https://urlscan.io/result/5cbabe27-a5e8-4a59-bb51-55843d0e3333/

As a thirteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website 9to5mac.com on November 23rd, 2024 with no pre-existing user IDs. While the bot was crawling 9to5mac.com, the bot was served an ad for American Express. The source code of the American Express ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the American Express ad included code from verification vendor IAS.

Screenshot of an American Express ad served to a bot. The American Express ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The American Express ad includes code from IAS. Source: https://urlscan.io/result/be2b34f5-bb1e-448e-a2f7-27e4a480896e/

As a fourteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website canadianliving.com on November 21st, 2024 with no pre-existing user IDs. While the bot was crawling canadianliving.com, the bot was served an ad for American Express. The source code of the American Express ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the American Express ad included code from verification vendor IAS.

Screenshot of an American Express ad served to a bot. The American Express ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The American Express ad includes code from IAS. Source: https://urlscan.io/result/c8a85af7-bfd0-4b84-ada9-22c6164257ab/

As a fifteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website pix11.com on February 18th, 2024 with no pre-existing user IDs. While the bot was crawling pix11.com, the bot was served an “in-stream video” ad for American Express in a muted, out-stream video player. The American Express video ad is not visible above-the-fold in the screenshot generated by URLScan.io’s bot.

The source code of the American Express ad shows that the video ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security. The source code of the American Express ad included code from verification vendor IAS.

A muted, auto-playing Jenny Craig (jennycraig.com) video is visible above the fold in the screenshot taken by the bot.

Screenshot of a Jenny Craig video ad served to a bot. An American Express ad was also served to the bot but is not visible above-the-fold in the screenshot generated by the bot. The American Express video ad creative can be viewed here: https://perma.cc/5QDY-TEMH. The American Express ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The American Express ad includes code from IAS. Source: https://urlscan.io/result/8d4cdf91-2e3b-4f18-b52a-79272f71f494/#summary

As a sixteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website images5.artemisooz.com on January 30th, 2024 with no pre-existing user IDs. While the bot was crawling images5.artemisooz.com, the bot was served an “in-stream video” ad for the US Department of Veterans Affairs mental health program (mentalhealth.va.gov) in a muted, auto-playing outstream video player. The Veterans Affairs video ad is not visible in the screenshot taken by the URLScan.io bot. The source code of the Department of Veterans Affairs ad shows that the ad was served to the bot (which lacks any pre-existing user identifiers) by Google DV360, which claims to partner with HUMAN Security.

An American Express pop-out banner ad can be seen in the middle of the screenshot taken by the bot.

Screenshot of an American Express ad that was served to a bot. A US Department of Veterans Affairs (mentalhealth.va.gov) video ad was also served to the bot but is not visible above-the-fold in the screenshot generated by the bot. The Veterans Affairs government video ad creative can be viewed here: https://perma.cc/Z76S-68TZ. The Department of Veterans Affairs ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). Source: https://urlscan.io/result/39d21dbb-7fbc-4dfc-871e-01c060354bb4/#summary

As a seventeenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website eliteprospects.com on November 20th, 2024 with no pre-existing user IDs. While the bot was crawling eliteprospects.com, the bot was served an “in-stream video” ad for Beam Suntory’s Maker’s Mark alcohol in a muted, auto-playing outstream video player. The source code of the muted outstream Maker’s Mark video ad that was served to the bot (which lacks any pre-existing user identifiers) shows that the ad was served by Google DV360, which claims to partner with HUMAN Security. The Beam Suntory ad also includes code from verification vendor DoubleVerify.

A Telus Digital ad was also served by DV360 to the bot.

Screenshot of a Beam Suntory Maker’s Mark alcohol ad served to a bot. The Beam Suntory ad appears to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The Beam Suntory ad includes code from DoubleVerify. Source: https://urlscan.io/result/d6381fc6-9329-4387-9697-b4cde34bb91a/#summary

As an eighteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website eliteprospects.com on November 24th, 2024 with no pre-existing user IDs. While the bot was crawling eliteprospects.com, the bot was served two “in-stream video” ads for Beam Suntory’s Maker’s Mark alcohol in muted, auto-playing outstream video players. The source code of the muted outstream Maker’s Mark video ads that were served to the bot (which lacks any pre-existing user identifiers) shows that the ads were served by Google DV360, which claims to partner with HUMAN Security. The Beam Suntory ads also include code from verification vendor DoubleVerify. A Samsung ad was also served to the bot.

Screenshot of two Beam Suntory Maker’s Mark alcohol ads served to a bot. The Beam Suntory ads appear to have been transacted by Google DV360 (which claims to partner with HUMAN Security). The Beam Suntory ads include code from DoubleVerify. Source: https://urlscan.io/result/fee0244a-600c-49d5-b7a7-4c2d83759bf6/#summary

As a nineteenth example, in the screenshot below, one can see an instance where URLScan.io’s bot crawled the website signup.com on December 12th, 2021 with no pre-existing user IDs. While the bot was crawling signup.com, the bot was shown two Lego ads that were transacted by IPG Matterkind. The source code of the two Lego ads that were shown to the bot (which lacks any pre-existing user identifiers) shows that the ads were transacted by Google DV360, which claims to partner with HUMAN Security.

The bot was also served a US Bank video ad that was transacted by Cross Media, Trade Desk DSP, and Sharethrough SSP. Trade Desk and Sharethrough claim to partner with HUMAN Security. The US Bank video ad is not visible in the screenshot taken by the bot.

Screenshot of two Lego ads served to a bot. The Lego ads appear to have been transacted by Google DV360 (which claims to partner with HUMAN Security). Not visible in the screenshot is a US Bank video ad that was served to the bot by Trade Desk and Sharethrough, both of whom claim to partner with HUMAN Security. Source: https://urlscan.io/result/fc75dda4-f6a3-43df-a0c4-c0f969fdfbcc/

As an additional example, in the screenshot below one can see a US Air Force ad that was transacted by Google Ad Manager (GAM) and DV360 to URLScan.io’s bot. The bot was crawling weather.com on February 27th, 2024, when the Air Force ad was served. Weather.com appears to utilize IAS’s publisher optimization tools on its pages, which labeled the page view as “fr=false” (valid, human traffic). The URLScan.io crawl link is: https://urlscan.io/result/98797fe7-ac9f-4691-bfe4-e6c70c06442c/

Screenshot showing a US Air Force ad that was served by Google Ad Manager (GAM) and DV360 to URLScan.io’s bot whilst the bot was crawling weather.com on February 27th, 2024. Source: https://urlscan.io/result/98797fe7-ac9f-4691-bfe4-e6c70c06442c/

Trade Desk and various SSP ads served to bots

Many ad tech vendors who have made public press releases or statements regarding partnering with HUMAN Security (f/k/a White Ops) have been observed serving Fortune 500 and/or US federal government advertisers’ ads to bots, including declared bots whose user agents have been on the IAB Tech Lab’s Bots & Spiders List since 2013. Some of these bots also operate out of various well known data center IP addresses, some of which may potentially be on advertising industry reference lists such as the Trustworthy Accountability Group’s (TAG) Data Center IP List.

For example, Index Exchange SSP stated in a press release:

“Index Exchange Partners with White Ops to Deliver Invalid Traffic Protection Against Sophisticated Bots Across All Global Inventory Channels”; “Index Exchange (IX), one of the world’s largest independent ad exchanges, and White Ops, the global leader in collective protection against sophisticated bot attacks and fraud, today announced an expanded partnership that enhances Index Exchange’s global inventory across all channels and regions. Through White Ops’ comprehensive protection, the partnership protects the entirety of Index Exchange’s global inventory. It allows buyers to purchase from IX’s emerging channels, such as mobile app and Connected TV (CTV), with confidence that its supply chain is protected against invalid traffic before a bid request is ever sent to a DSP and made eligible” (emphasis added).

Demand side platform Trade Desk announced:

“White Ops and The Trade Desk [...] today announce a landmark deal that completely changes how the advertising industry tackles fraud. White Ops’ Human Verification technology will aim to ensure that there is a human on the other end of every impression served on The Trade Desk that runs through White Ops, in real time, protecting global advertisers and agencies from buying fraudulent impressions [...] For too long, invalid traffic has been part of our industry,” said Jeff Green, CEO and co-founder of The Trade Desk. “There’s no level of fraud that is acceptable. Our partnership with White Ops means that we are the first advertising platform to block non-human impressions at the front door. [...] ​​As part of this initiative, White Ops and The Trade Desk will co-locate servers and data centers in North America, Europe and Asia, to scan every biddable ad impression in real-time. [...] When a non-human impression, known as “Sophisticated Invalid Traffic (SIVT)” within the advertising industry, is identified by White Ops, The Trade Desk will block that impression from serving. The intent is this technology will be applied to every impression The Trade Desk bids on that runs through White Ops, on a global basis. [...] Unlike other solutions, the goal here is to run all impressions across The Trade Desk’s platform through White Ops, not just sampled impressions. Additionally, the Trade Desk has collaborated with the leading SSPs to bring a unified solution to market.” (emphasis added).

Screenshot of HUMAN Security’s website, showing a press release from 2017

Screenshot of Trade Desk’s website, showing the firm partners with HUMAN Security

Xandr (f/k/a AppNexus), now owned by Microsoft, announced:

Xandr [...] and HUMAN, a cybersecurity company best known for collectively protecting enterprises from bot attacks, today announced an expansion to HUMAN’s existing pre-bid bot protection within the Xandr platform, to provide an additional layer of protection against fraud and sophisticated bot attacks as emerging formats like connected TV (CTV) scale in availability and demand. This integration connects the full breadth of HUMAN’s Advertising Integrity suite of ad fraud detection and prevention solutions to Xandr’s buy-and sell-side platforms [...] Xandr protects its platform before a bid is even made—including within CTV—to continue delivering success to its publishers and advertisers. HUMAN recently became the first company to receive accreditation from the Media Rating Council (MRC) for pre and post-bid protection against Sophisticated Invalid Traffic (SIVT) for desktop, mobile web, mobile in-app, and CTV.

Many ad tech vendors are part of HUMAN Security’s “Human Collective.”

List of Flagship and Founding Members of HUMAN Security’s “Human Collective”. Source: https://www.humansecurity.com/company/the-human-collective. Participation in the Human Collective entails: “Technology - Powered by HUMAN's proprietary technology and Modern Defense Platform, we can ensure members are protecting themselves and each other.”

A list of vendors who issued public statements or claimed to have partnered with HUMAN Security (f/k/a “White Ops”) can be seen below.

The Trade Desk: https://www.humansecurity.com/newsroom/the-trade-desk-partners-with-white-ops-to-become-first-advertising-platform-to-block-fraudulent-impressions-before-they-are-purchased (8/31/2017)
Microsoft Xandr: https://www.prnewswire.com/news-releases/appnexus-expands-inventory-quality-initiative-through-partnership-with-white-ops-300666958.html (6/15/2018)
Index Exchange: https://www.humansecurity.com/newsroom/index-exchange-partners-with-white-ops (8/12/2020)
Google: https://support.google.com/campaignmanager/answer/6076504?hl=en (N/A)
Sovrn: https://www.humansecurity.com/newsroom/the-human-collective-grows-more-than-5x-since-april-launch (10/13/2021)
Yieldmo: https://www.humansecurity.com/learn/blog/yieldmo-and-white-ops-ensuring-squeaky-clean-mobile-inventory (9/10/2019)
Sharethrough: https://www.sharethrough.com/blog/sharethrough-founding-member-of-the-human-collective-in-effort-to-keep-ads-human (10/27/2021)
Kargo: https://www.humansecurity.com/newsroom/kargos-curated-marketplace-boasts-less-than-0.3-of-invalid-traffic-following-implementation-of-white-ops-mediaguard (5/29/2019)
GumGum: https://www.businesswire.com/news/home/20190529005284/en/GumGum-Partners-with-White-Ops-to-Deliver-Comprehensively-Safe-Ad-Exchange (5/29/2019)
TripleLift: https://www.humansecurity.com/newsroom/triplelift-and-white-ops-partner-to-fight-fraud-in-native-advertising (5/1/2018)
Magnite: https://www.businesswire.com/news/home/20161220005758/en/Joint-Statement-From-White-Ops-CEO-and-Rubicon-Project-President-on-Successful-Efforts-Countering-Russian-Ad-Fraud (12/20/2016)
Pubmatic: https://www.humansecurity.com/newsroom/pubmatic-partners-with-white-ops-to-fight-bot-fraud-and-drive-higher-transparency-in-video-inventory (12/5/2017)
Sonobi: https://www.prnewswire.com/news-releases/sonobi-partners-with-human-formerly-white-ops-to-safeguard-platform-from-sophisticated-bot-fraud-301286928.html (5/11/2021)
Freewheel: https://www.humansecurity.com/newsroom/freewheel-and-white-ops-expand-partnership-globally-to-further-deepen-trust-in-premium-video-inventory (4/29/2020)
Media.net: https://www.humansecurity.com/newsroom/white-ops-media.net-partnership-extends-pre-bid-fraud-protection-for-brands (5/15/2019)
Beachfront: https://www.humansecurity.com/newsroom/white-ops-beachfront (5/15/2018)
Primis: https://www.humansecurity.com/newsroom/primis-expands-partnership-with-white-ops-in-fight-against-fraud-to-create-a-clean-and-trusted-video-supply-chain (9/1/2020)
Omnicom: https://www.humansecurity.com/newsroom/human-formerly-white-ops-launches-the-human-collective-to-protect-against-bot-attacks-and-fraud-across-advertising-supply-chain (4/14/2021)

In the bar chart below, one can see the relative number of ad impressions served to bots in the data sample on behalf of various advertisers using the Trade Desk, broken out by supply side platform, ad exchange, or integration. The largest relative number of ads purchased via Trade Desk DSP that were served to bots in the data sample were transacted via Trade Desk Open Path. Trade Desk Open Path was observed serving ads to bots on behalf of brands such as the Government of DC, US Bank, Progressive, Procter & Gamble (P&G), Hershey’s, WideOpenWest Finance, Kimberly-Clark, Samsung, UPS, Coca-Cola, Walmart, McDonald’s, and others.

The second largest relative number of Trade Desk DSP ads served to bots in the sample were transacted by Microsoft Xandr SSP. Xandr has issued multiple announcements about partnering with HUMAN Security (f/k/a White Ops). Xandr was observed serving Trade Desk DSP ads to declared bots (whose user agent has been on the IAB Tech Lab’s Bots & Spiders List since 2013) operating out of known data center IP addresses on behalf of advertisers such as Procter & Gamble, Target, Sling, US Bank, MasterCard, Starbucks, UPS, T-Mobile, Progressive, Pfizer, Ally, Nestle, Cox, Walmart, Hilton, Unilever, Haleon, Hershey’s, Coca-Cola, the Government of Utah, Samsung, and many others.

The third largest relative number of Trade Desk DSP ads served to bots in the sample were transacted by Index Exchange SSP. Index Exchange has issued multiple announcements about partnering with HUMAN Security (f/k/a White Ops), stating that the partnership allows advertisers to buy “with confidence that its supply chain is protected against invalid traffic before a bid request is ever sent to a DSP and made eligible.” Index Exchange was observed serving Trade Desk DSP ads to declared bots (whose user agent has been on the IAB Tech Lab’s Bots & Spiders List since 2013) operating out of known data center IP addresses on behalf of media agencies such as Omnicom, GroupM, and MiQ and advertisers such as the United States Navy, the Government of DC, Kenvue, Progressive, US Bank, Hershey’s, Procter & Gamble, T-Mobile, Target, Kimberly-Clark, Walmart, Nestle, Coca-Cola, Pizza Hut, State Farm, Starbucks, American Airlines, Apple, Jimmy Dean, Mars, H&R Block, Haleon, Bayer, Unilever, Volkswagen, Novo Nordisk, and many others.

Screenshot of URLScan.io’s website, showing a URLScan.io bot crawl from June 16th, 2023. One can see a US Navy ad that was mediated by Trade Desk, Index Exchange, DoubleVerify, and/or HUMAN Security on msn.com.

Screenshot of a Procter & Gamble ad served to a bot with no pre-existing user IDs by Trade Desk and Index Exchange (via PMP deal ID “335308282023991”). URLScan source: https://urlscan.io/result/80b7a9f9-e1c5-4cfb-ac0c-a1978819a6fb/

Some of the Trade Desk ads that Index Exchange transacted to declared bots in data centers appear to have been transacted via PMP deal IDs. For example, Index Exchange appeared to transact Trade Desk ads to declared bots on behalf of Home Instead (PMP deal ID “OMGIXMARKETPLACEVIDEO2020”), State Farm (PMP deal ID “OMGIXMARKETPLACEDISPLAY2020”), Apple (PMP deal ID “OMGIXMARKETPLACEDISPLAY2020”), and Hallmark (PMP deal IDs “OMGIXMARKETPLACEDISPLAY2020” and “OMGIXMARKETPLACEVIDEO2020”). Index Exchange also appeared to transact Trade Desk ads to declared bots in data centers on behalf of T-Mobile (PMP deal IDs “IXCPDTTDTMOBILEDESK”, “IXCPDTTDTMOBILEMOBAPP”, “IXCPDTTDTMOBILEMOBW”, “IXCPDTTDTMOBILEOUT”, and others), Kimberly-Clark (PMP deal ID “IXCPDMSKCTTDDIS”), Walmart (PMP deal IDs “IXIVPWMRTTECHFSHNDIS”, “IXIVPWMRTFAMBGTDIS”, “IXCPDTTDUSADISPLAYNATIVEFINANCE”, and others), Coca-Cola (PMP deal IDs “IXIVPCOCACOLA” and “IXIVPCOCACOLARONVIDEO”), Bayer (PMP deal ID “IXCPDBAYERDIS001”), Jimmy Dean (PMP deal ID “GROUPMIXFLOOR1SHEMEDIA”), Unilever (PMP deal IDs “IXIVPUNILEVERHISMOB”, “IXIVPUNILEVERHISDIS”, “IX621858871749986750”, and others), and Mars (PMP deal ID “IXIVPTTDMARSAAOLV”).

For example, in the screenshot below one can observe an IBM ad that was served to URLScan.io’s bot on May 24th, 2024 whilst crawling washingtonpost.com. The IBM ad appears to have been transacted by GroupM media agency, Trade Desk, and Index Exchange (via PMP deal ID “IXCPDGROUPM8BARIBMGLOBAL”). The source code of the IBM ad includes references to “charge-allDoubleVerifyBotAvoidance”, and code from DoubleVerify. IAS’ publisher optimization tool was installed on washingtonpost.com, and classified the URLScan.io bot as “fr=false” (valid, human traffic).

Screenshot of an IBM ad served to a bot with no pre-existing user IDs by Trade Desk and Index Exchange (via PMP deal ID “IXCPDGROUPM8BARIBMGLOBAL”). The source code of the IBM ad includes references to “charge-allDoubleVerifyBotAvoidance”. The publisher’s website invokes the IAS endpoint “pixel.adsafeprotected.com/services/pub”, which returns the classification (“fr=false”, meaning valid, human traffic). URLScan source: https://urlscan.io/result/6c90a702-f944-472c-a91c-267bc1532895/
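The “charge-all…” strings cited throughout this report are plain-text markers that appear inside the markup of the served creatives. The sketch below shows one way to tally such verification-related markers across a folder of saved creatives; the marker list is drawn from strings quoted in this report, and the input layout (one saved HTML creative per file) is an assumption.

```python
import glob
from collections import Counter

# Verification-related markers quoted elsewhere in this report.
MARKERS = [
    "charge-allDoubleVerifyBotAvoidance",
    "charge-allIntegralSuspiciousActivity",
    "charge-allIntegralQualitySync",
    "charge-allScibids",
]

counts = Counter()
# Assumed layout: each served creative saved as an .html file.
for path in glob.glob("creatives/*.html"):
    with open(path, encoding="utf-8", errors="replace") as f:
        body = f.read()
    for marker in MARKERS:
        if marker in body:
            counts[marker] += 1

for marker, n in counts.most_common():
    print(f"{marker}: found in {n} creatives")
```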

As another example, in the screenshot below one can observe a T. Rowe Price ad that was served to URLScan.io’s bot on March 18th, 2022 whilst crawling washingtonpost.com. The T. Rowe Price ad appears to have been transacted by GroupM media agency, Trade Desk, and Index Exchange (via PMP deal ID “GROUPMIXFLOOR1WAPO”). The source code of the T. Rowe Price ad includes references to “charge-allDoubleVerifyBotAvoidance”, and code from DoubleVerify. IAS’ publisher optimization tool was installed on washingtonpost.com, and classified the URLScan.io bot as “fr=false” (valid, human traffic).

Screenshot of a T. Rowe Price ad served to a bot with no pre-existing user IDs by GroupM, Trade Desk, and Index Exchange (via PMP deal ID “GROUPMIXFLOOR1WAPO”). The source code of the T. Rowe Price ad includes references to “charge-allDoubleVerifyBotAvoidance”. The publisher’s website invokes the IAS endpoint “pixel.adsafeprotected.com/services/pub”, which returns the classification (“fr=false”, meaning valid, human traffic). URLScan source: https://urlscan.io/result/4020f54f-d1d2-44c0-84d8-46cadc0032e0/

As another example, in the screenshot below one can observe an AbbVie Botox ad, a Progressive insurance ad, and a McCormick ad that were served to URLScan.io’s bot on August 5th, 2023 whilst crawling merriam-webster.com. The AbbVie Botox ad appears to have been transacted by Trade Desk and Index Exchange, and its source code includes references to “charge-allDoubleVerifyBotAvoidance”. The McCormick ad appears to have been transacted by Trade Desk and TripleLift (via PMP deal ID “tlx-48305”), and its source code includes references to “charge-allDoubleVerifyBotAvoidance”. The Progressive insurance ad appears to have been transacted by Trade Desk and Yieldmo, and its source code includes references to Scibids. Two Virgin Voyages ads were also served to the bot and appear to have been transacted by Google DV360.

Multiple DV360 and Trade Desk ads served to URLScan.io’s bot, which was operating out of a data center with no pre-existing user IDs whilst crawling merriam-webster.com on August 5th, 2023. The bot was served two ads for Virgin Voyages by Google, and ads for McCormick, Progressive, and AbbVie Botox. The McCormick and AbbVie ads’ source code contains references to “charge-allDoubleVerifyBotAvoidance”, whilst the Progressive video ad contains references to “charge-allScibids”. The ads were transacted by Yieldmo, Index Exchange, and TripleLift (via PMP deal ID “tlx-48305”), all of whom have many public statements about partnering with HUMAN Security. Source: https://urlscan.io/result/4bae7001-0889-4c29-9256-1cc88df48182/

As another example, in the screenshot below one can observe a Nestle Purina ad, a US Bank ad, a Target (Ulta) ad, and a Mint Mobile ad that were served to URLScan.io’s bot on November 24th, 2023 whilst crawling laughingsquid.com.

The Nestle Purina ad appears to have been transacted by Trade Desk and Triplelift (via PMP deal ID “tlx-49380”). The US Bank ad appears to have been transacted by Trade Desk Open Path.

The Target (Ulta) ad appears to have been transacted by Trade Desk and Yieldmo (via PMP deal ID “yieldmo_target_essence_ulta_highimpactdisplay_contextual_ttd”).

A Mint Mobile ad appears to have been transacted by Google DV360 and TrustX (via PMP deal ID “G-TRUSTX-c0ae56c742a94e06”).

Multiple DV360 and Trade Desk ads served to URLScan.io’s bot, which was operating out of a data center with no pre-existing user IDs whilst crawling laughingsquid.com on November 23rd, 2023. The bot was served a Mint Mobile video ad by DV360 and TrustX (via PMP deal ID “G-TRUSTX-c0ae56c742a94e06”). The bot was also served ads for Nestle Purina, US Bank, and Target (Ulta). The ads were transacted by TripleLift (via PMP deal ID “tlx-49380”), Trade Desk Open Path, and Yieldmo (via PMP deal ID “yieldmo_target_essence_ulta_highimpactdisplay_contextual_ttd”) respectively, all of whom have many public statements about partnering with HUMAN Security. Source: https://urlscan.io/result/8d65a58f-bb18-41ba-96e3-c89d6fd70575/

As another example, six Samsung ads were served to URLScan.io’s bot whilst it was crawling foodiecrush.com on December 21st, 2021 from an Amazon Web Services (AWS) data center IP address with no pre-existing user IDs. Only two of the six Samsung ads are visible above-the-fold in the screenshot. The six Samsung ads were transacted by Trade Desk, Index Exchange, Pubmatic, Magnite, and GumGum. All five vendors claim to partner with HUMAN Security. The source code of the Samsung ads includes references to “charge-allIntegralSuspiciousActivity” and JavaScript from IAS.

Six Samsung ads served by Trade Desk to URLScan.io’s bot, which was operating out of an Amazon Web Services (AWS) data center with no pre-existing user IDs whilst crawling foodiecrush.com on December 21st, 2021. Only two of the six Samsung ads are visible above the viewport in the screenshot. The six Samsung ads were transacted by Pubmatic, Index Exchange, Magnite, and GumGum, all of whom have many public statements about partnering with HUMAN Security. The source code of the six Samsung ads that were served to a bot in an AWS data center contain references to “charge-allIntegralSuspiciousActivity” and JavaScript from IAS. Source: https://urlscan.io/result/8dfbe2e1-d534-481d-904b-059c1a59f0fd/

As another example, a CVS ad, a Kimberly-Clark Depend ad, a Unilever Dove ad, and two James Hardie ads were served to URLScan.io’s bot whilst it was crawling speedtest.net on March 17th, 2024.

The CVS ad appears to have been transacted by Trade Desk and Magnite (via PMP deal ID “GB-GroupBlack-11”), and its source code includes references to “charge-allIntegralPageQuality” and code from IAS.

The Kimberly-Clark Depend ad that was served to the bot appears to have been transacted by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security. The source code of the Kimberly-Clark ad includes references to “charge-allIntegralQualitySync” and code from IAS.

The Unilever Dove ad that was served to a bot appears to have been transacted by Trade Desk and Magnite (via PMP deal ID “MGNI-MD-9-5198”). The Unilever Dove ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify.

The two James Hardie ads that were served to the bot appear to have been transacted by Publicis Groupe-owned Epsilon and Magnite. The source code of the James Hardie ads includes references to “Apex”. According to Ad Age, “Apex Exchange, is making principal investments in both traditional and digital media inventory that it then resells to clients [...] “We are very clear with clients where we've made an investment." Those clients waive their right to transparency on costs and fees.” The James Hardie source code includes references to IAS.

Multiple ads served by Trade Desk to URLScan.io’s bot, which was operating with no pre-existing user IDs whilst crawling speedtest.net on March 17th, 2024. One Kimberly-Clark Depend ad was transacted by Magnite to the bot, and the source code of the Kimberly-Clark Depend ad includes references to “charge-allIntegralQualitySync”. A Unilever Dove ad was transacted to the bot by Magnite via PMP deal ID “MGNI-MD-9-5198”. The source code of the Unilever Dove ad that was served to a bot includes references to “charge-allDoubleVerifyBotAvoidance”. A CVS ad was transacted by Magnite to the bot via PMP deal ID “GB-GroupBlack-11”. The source code of the CVS ad includes references to “charge-allIntegralPageQuality”. In addition to the three Trade Desk-transacted ads, two ads for James Hardie were transacted by Publicis Groupe-owned Epsilon and Magnite. The source code of the James Hardie ads includes references to “Apex”. According to Ad Age, “Apex Exchange, is making principal investments in both traditional and digital media inventory that it then resells to clients [...] “We are very clear with clients where we've made an investment." Those clients waive their right to transparency on costs and fees.” The James Hardie source code includes references to IAS. Source: https://urlscan.io/result/344a1261-ab62-48e9-be81-2b66f7f4101c/

As another example, two Unilever Dove ads were served to URLScan.io’s bot whilst it was crawling merriam-webster.com on November 19th, 2024 with no pre-existing user IDs. The two Unilever Dove ads were transacted by Trade Desk and Magnite (via PMP deal ID “TRD-9262-00da4”). The source code of the Unilever Dove ads includes code from DoubleVerify.

Two Unilever Dove ads served by Trade Desk to URLScan.io’s bot, which was operating with no pre-existing user IDs whilst crawling merriam-webster.com on November 19th, 2024. The two Unilever Dove ads were transacted by Magnite (via PMP deal ID “TRD-9262-00da4”). The Unilever Dove ads contain code from DoubleVerify. One can also see ads for Walmart and University of Phoenix (transacted by Google). Source: https://urlscan.io/result/287c6db3-5576-41b5-84c6-f1a3398de300/

As another example, a Hershey’s Kisses ad was served to URLScan.io’s bot whilst it was crawling mydominicankitchen.com on December 6th, 2023 with no pre-existing user IDs from a data center in Australia. The Hershey’s Kisses ad was transacted by Trade Desk and Index Exchange (via PMP deal ID “20809072023991”), both of whom claim to partner with HUMAN Security. The source code of the Hershey’s ad includes references to IAS.

The Hershey’s ad appears to have been targeted to a consumer in Wilmington, Delaware, USA, but was served to a bot in a data center in Australia. This may be due to an outdated ip-to-geolocation lookup database used by one of the ad tech vendors.

URLScan.io’s bot crawled mydominicankitchen.com on December 6th, 2023 with no pre-existing user IDs from a Latitude.sh data center in Sydney, Australia 🇦🇺. A Hershey’s Kisses ad that was meant to be served in the USA 🇺🇸 was served by Trade Desk and Index Exchange (via PMP deal ID “20809072023991”) to a bot operating out of a data center in Australia 🇦🇺. Both ad tech vendors claim to partner with HUMAN Security. The ad tech vendors inferred the bot’s geolocation as being in Wilmington, Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup table (the IP address was last located in the USA in April 2020). The Hershey’s ad that was served to the bot includes code from and references to IAS.

Source: https://urlscan.io/result/0e40e3af-76ae-43cb-84cf-d7b2cbfe3440/#summary
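One way to test the stale-geolocation hypothesis is to look the same IP address up in geolocation database snapshots from different dates. The sketch below uses the maxminddb Python package against two GeoIP-style snapshot files; the file paths and the IP address are placeholders, and which geolocation vendor the ad tech firms actually rely on is unknown.

```python
import maxminddb

# Placeholder paths: an archived ~2020 snapshot and a current snapshot
# of a GeoIP-style database. The IP below is a documentation address
# standing in for the data center IP in question.
SNAPSHOTS = {
    "2020-04 snapshot": "GeoLite2-City_2020-04.mmdb",
    "current snapshot": "GeoLite2-City_latest.mmdb",
}
BOT_IP = "203.0.113.7"

for label, path in SNAPSHOTS.items():
    reader = maxminddb.open_database(path)
    record = reader.get(BOT_IP) or {}
    country = record.get("country", {}).get("iso_code", "unknown")
    print(f"{label}: {BOT_IP} -> {country}")
    reader.close()
# If the 2020 snapshot says "US" and the current one says "AU", a vendor
# relying on the older data would mis-target US campaigns to a bot in an
# Australian data center.
```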

As another example, a Hershey’s ad was served to URLScan.io’s bot whilst it was crawling thewest.com.au on July 20th, 2023 with no pre-existing user IDs from a data center in Australia. The Hershey’s ad was transacted by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security. The source code of the Hershey’s ad includes references to IAS and “charge-allIntegralQualitySync”.

The Hershey’s ad appears to have been targeted to a consumer in West Monroe, New York, USA, but was served to a bot in a data center in Australia. This may be due to an outdated ip-to-geolocation lookup database used by one of the ad tech vendors.

URLScan.io’s bot crawled thewest.com.au on July 20th, 2023 with no pre-existing user IDs from a Latitude.sh data center in Sydney, Australia 🇦🇺. A Hershey's ad (transacted via Horizon Media) that was meant to be served in the USA 🇺🇸 was served by Trade Desk and Index Exchange to a bot operating out of a data center in Australia 🇦🇺. Both ad tech vendors claim to partner with HUMAN Security. The ad tech vendors may have inferred the bot’s geolocation as being in West Monroe, New York 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup table (the IP address was last located in the USA in April 2020). The Hershey’s ad that was served to the bot includes code from IAS and references to “charge-allIntegralQualitySync”.

Source: https://urlscan.io/result/0193480a-548d-420d-ba00-8b676628baf2/#summary

As another example, in the screenshot below one can see an instance where New York City Police (NYPD) ads were served to URLScan.io’s bot on June 2, 2024 whilst the bot was crawling applefitnessforum.co (URLScan.io link: https://urlscan.io/result/37d33b15-1345-4637-a664-86897279f57b/). The ads were transacted by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security.

Screenshot of a New York City Police Department (NYPD) ad served to a bot. The ad was served to URLScan.io’s bot on June 2, 2023, whilst the bot was crawling applefitnessforum.com. The ad was transacted by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security. Source: https://urlscan.io/result/37d33b15-1345-4637-a664-86897279f57b/

As another example, a Hershey’s video ad, four Target Ulta ads (only two of which are visible), a Burt’s Bees ad, and a Microsoft Edge ad were served to URLScan.io’s bot whilst it was crawling laughingsquid.com on November 25th, 2023 with no pre-existing user IDs from a data center.

The Hershey’s video ad was transacted by Trade Desk and TripleLift. The source code of the Hershey’s ad includes references to IAS, “charge-allIntegralVideoViewability”, and “charge-allIasVideoViewabilityTracking”.

The four Target (Ulta) ads (of which only two are visible in the screenshot generated by the bot) appear to have been transacted by Trade Desk and Yieldmo (via PMP deal ID “yieldmo_target_essence_ulta_highimpactdisplay_contextual_ttd”).

The Microsoft Edge ad that was served to the bot appears to have been transacted by Trade Desk and Magnite (via PMP deal ID “TRD-9262-00a6c”), both of whom claim to partner with HUMAN Security. The Microsoft ad includes code from IAS.

URLScan.io’s bot crawled laughingsquid.com on November 25th, 2023 with no pre-existing user IDs from a Nexeon data center. Four Target (Ulta) ads served to a bot in a data center with no pre-existing user IDs by Trade Desk and Yieldmo (via PMP deal ID “yieldmo_target_essence_ulta_highimpactdisplay_contextual_ttd”), both of whom claim to partner with HUMAN Security. Only two of the four Target (Ulta) ads are visible above the fold in the screenshot. Microsoft Edge ad served to a bot in a data center with no pre-existing user IDs by Trade Desk and Magnite (via PMP deal ID “TRD-9262-00a6c”), both of whom claim to partner with HUMAN Security. The Microsoft ad includes code from IAS. Muted, out-stream Hershey’s video ad served to a bot in a data center with no pre-existing user IDs by Trade Desk and TripleLift, both of whom claim to partner with HUMAN Security. The Hershey’s ad that was served to a bot includes references to “charge-allIntegralVideoViewability” and “charge-allIasVideoViewabilityTracking”, as well as code from IAS.

https://urlscan.io/result/8926d515-487b-4487-89da-c34b3c868fcf/#summary

As another example, two Visa ads were served to URLScan.io’s bot whilst it was crawling portevergladeswebcam.com on April 19th, 2023, from a data center in Germany. The source code of the Visa ads suggests the ads were transacted by Publicis Apex, and were served by Trade Desk and Microsoft Xandr SSP, both of whom claim to partner with HUMAN Security. The source code of the Visa ads includes references to “charge-allIntegralSuspiciousActivity” as well as code from IAS.

The URLScan.io bot in Germany did not appear to provide affirmative consent under GDPR for targeted advertising or the setting of persistent advertising-related identifiers. Nevertheless, the source code of the Visa ads includes the string “utm_term=mobile-retargeting”.

URLScan.io’s bot crawled portevergladeswebcam.com on April 19th, 2023 with no pre-existing user IDs. Two Visa ads transacted by Publicis Apex were served to a bot with no pre-existing user IDs by Trade Desk and Microsoft Xandr, both of whom claim to partner with HUMAN Security. The ads were served in Germany, and the bot did not take any actions to provide affirmative consent for behavioral targeting or tracking under GDPR. The Visa ads that were served to a bot include references to “charge-allIntegralSuspiciousActivity”, as well as code from IAS. The source code of the Visa and Publicis Apex ads - which were served to bots with no pre-existing user IDs that did not take any actions to provide consent under GDPR - says: “utm_term=mobile-retargeting”.

Source: https://urlscan.io/result/af8c34e3-f3a7-4aa1-a60c-b9e9911cc3d1/ ; Source: https://urlscan.io/responses/36fd53d0c81ebf9bb5075c7d8987dd967915b191e7cfad3b96a285022e488e25/
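
A minimal sketch of the kind of consent check implied here: under the IAB’s TCF v2 framework, affirmative consent is typically persisted in a consent-string cookie (e.g. “euconsent-v2”), and a fresh bot session carries no such cookie. The cookie names and session structure below are simplified assumptions for illustration.

```python
# Hedged sketch of a TCF v2 consent check. Under the IAB's TCF v2, consent
# is commonly persisted in a consent-string cookie; the cookie names and
# session dict below are simplified assumptions for illustration.
TCF_COOKIE_NAMES = {"euconsent-v2", "eupubconsent-v2"}

def has_affirmative_consent(session_cookies: dict) -> bool:
    """True only if a non-empty TCF consent string is present."""
    return any(session_cookies.get(name) for name in TCF_COOKIE_NAMES)

bot_session = {}  # the bot starts each crawl with no cookies or identifiers
print(has_affirmative_consent(bot_session))  # False: no consent was ever given
```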

As another example, three Singaporean government ads were served to URLScan.io’s bot whilst it was crawling wiki.sg on January 26th, 2024. The Singaporean government ads appear to have been served by Trade Desk and Pubmatic (via PMP deal ID “PM-VGNP-7498”), both of whom claim to partner with HUMAN Security.

Three Singaporean government ads served to a bot by Trade Desk and Pubmatic. Source: https://urlscan.io/result/77f70125-6ccb-4187-87ed-35c0341f6ee5/

As another example, a Procter & Gamble Metamucil ad was served to URLScan.io’s bot whilst it was crawling file.io on August 8th, 2024. The P&G Metamucil ads appear to have been served by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security.

Source: https://urlscan.io/result/edb8e60d-e0c4-4e1d-b62e-62c82cb36959/

As another example, two New York state government ads and two T-Mobile ads were served to URLScan.io’s bot whilst it was crawling bronx.news12.com on October 21st, 2023. The New York state government ads were transacted by Trade Desk, Triplelift and Microsoft Xandr SSP, all three of whom claim to partner with HUMAN Security. The source code of the New York state government ads includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. The two T-Mobile ads appear to have been served by Trade Desk and Triplelift (via PMP deal ID “tlx-42874”). The T-Mobile ads include code from IAS.

Two New York state government ads and two T-Mobile ads served to URLScan.io’s bot by Trade Desk, Triplelift, and Xandr. The New York state government ads include references to “charge-allDoubleVerifyBotAvoidance”. Source: https://urlscan.io/result/50582b1b-72dd-4288-afd7-1730d4ad234b/

As another example, two US Bank ads were served to URLScan.io’s bot whilst it was crawling tabs.ultimate-guitar.com on December 12th, 2023 from a data center in Australia. The two US Bank ads were transacted by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security.

The US Bank ads appear to have been targeted to a consumer in Wilmington, Delaware, USA, but were served to a bot in a data center in Australia. This may be due to the ad tech vendors using an outdated ip-to-geolocation lookup table.

URLScan.io’s bot 🤖 crawled tabs.ultimate-guitar.com with no pre-existing user IDs on December 12th, 2023 - from a Latitude.sh data center in Sydney, Australia 🇦🇺. Two US Bank ads that may have been targeted to consumers in the USA 🇺🇸 were served to URLScan.io’s bot 🤖 with no pre-existing user IDs operating out of a data center in Australia 🇦🇺. The US Bank ads were served by Trade Desk and Magnite. Both vendors allegedly partner with HUMAN Security. The ad tech vendors may have inferred the bot’s geolocation as being in Wilmington, Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database.

Source: https://urlscan.io/result/bb1f7a43-434f-4024-8356-7f1ae8f4926c/

As another example, an Indiana state government ad, a Procter & Gamble Pantene video ad, and a P&G Native deodorant ad were served to URLScan.io’s bot whilst it was crawling yahoo.com on March 13th, 2024. The Indiana state government ad was served by Trade Desk and Magnite, and includes references to IAS and “charge-allIntegralPageQuality.”

The two muted, auto-playing out-stream P&G ads were transacted by Trade Desk and Magnite, and include references to “charge-allQAVideoViewability”, “tapad”, “comscore_dr”, “alliancelotame”, “lotadirect”, and “charge-allVideoCompletionRate”.

Indiana state government and two P&G muted, auto-playing, outstream video ads served to a bot with no pre-existing user identifiers. Source: https://urlscan.io/result/dc079ffb-4961-4f37-bb38-edc4a45916a6/

As another example, a Blue Diamond Growers video ad, P&G Gillette video ad, Microsoft Edge video ad, and TeamSnap ad were served to URLScan.io’s bot whilst the bot was crawling fantasyfootballcalculator.com on May 26th, 2023 from a data center in Australia.

The Blue Diamond, P&G Gillette, and TeamSnap ads were transacted by Trade Desk DSP and by Kargo, Yieldmo, and Trade Desk Open Path (respectively). The ads appear to have been targeted to a consumer in Sneads, Florida, but were served to a bot with no user identifiers in a data center in Australia, possibly due to one or more ad tech vendors using an outdated IP address to geolocation lookup table.

The muted, auto-playing, outstream Blue Diamond Growers video ad includes code from IAS and references to “charge-allIntegralVideoSuspiciousActivity”. The muted, auto-playing, out-stream P&G Gillette video ad that was served to a bot with no pre-existing user identifiers includes references to “charge-allQAVideoViewability”, “q-alliance”, and “charge-allNielsenDARMeasurement”. The TeamSnap ad includes references to “charge-allAppsflyer”.

A Blue Diamond Growers, a P&G Gillette, and a TeamSnap ad served to a bot in a data center in Australia with no pre-existing user identifiers. The source code of the Blue Diamond Growers ad includes references to “charge-allIntegralVideoSuspiciousActivity”. Source: https://urlscan.io/result/d6ac7a4f-6414-4e2f-804e-296cdd44eb39/

As another example, a Bank of America Merrill Lynch ad was served to URLScan.io’s bot whilst the bot was crawling jsonline.com on January 24th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Sharethrough, all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

The publisher jsonline.com appears to have configured IAS’ publisher optimization tool on its page. The IAS tool appears to have classified the URLScan.io bot as “fr=false” (valid, human traffic).

A Bank of America Merrill Lynch ad served to URLScan.io’s bot whilst the bot was crawling jsonline.com on January 24th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Sharethrough, all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

Source: https://urlscan.io/result/1fe20077-1bab-415e-9f37-dbbd317563e9/
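
A hedged sketch of what interpreting that classification could look like: the publisher-side IAS response carries a fraud flag, and “fr=false” means the session was treated as valid, human traffic. The payload shape below is an assumption based on the observed “fr=false” string, not IAS’s documented schema.

```python
# Hedged sketch: interpreting the "fr" flag observed in the IAS publisher
# optimization response. The payload shape is an assumption based on the
# observed "fr=false" string, not IAS's documented schema.
import json

sample_ias_response = json.loads('{"fr": false}')  # fr: fraud/invalid-traffic flag

def classified_as_fraud(resp: dict) -> bool:
    # fr=false would mean the session was treated as valid, human traffic -
    # even though, here, it was a declared bot in a data center.
    return bool(resp.get("fr", False))

print(classified_as_fraud(sample_ias_response))  # False
```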

As another example, a Bank of America Merrill Lynch ad was served to URLScan.io’s bot whilst the bot was crawling healthline.com on February 5th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Triplelift (via PMP deal ID “tlx-36443”), all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

A Bank of America Merrill Lynch ad served to URLScan.io’s bot whilst the bot was crawling healthline.com on February 5th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Triplelift, all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

Source: https://urlscan.io/result/f2f44b2a-0333-4446-9909-512834d0581b/

As another example, a Diageo alcohol ad was served to URLScan.io’s bot whilst the bot was crawling britannica.com on July 18th, 2024. The Diageo ad appears to have been served by Trade Desk and Sovrn, both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

A Diageo alcohol ad served to URLScan.io’s bot whilst the bot was crawling britannica.com on July 18th, 2024. The Diageo ad appears to have been served by Trade Desk and Sovrn, both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”. Source: https://urlscan.io/result/4761b152-a29e-4cc0-85c1-62205f764254/

As another example, three T-Mobile ads and a St. George’s University ad were served to URLScan.io’s bot whilst the bot was crawling eliteprospects.com on November 24th, 2024. The three T-Mobile ads appear to have been transacted by Trade Desk and Magnite (via PMP deal ID “MGNI-AP-5083”), both of whom claim to partner with HUMAN Security. The St. George’s University video ad appears to have been served by Google DV360.

Three T-Mobile ads and a St. George’s University ad were served to URLScan.io’s bot whilst the bot was crawling eliteprospects.com on November 24th, 2024. The three T-Mobile ads appear to have been transacted by Trade Desk and Magnite (via PMP deal ID “MGNI-AP-5083”), both of whom claim to partner with HUMAN Security. The St. George’s University video ad appears to have been served by Google DV360. Source: https://urlscan.io/result/479ff2c1-0b40-42be-9bbf-2b9dc6aa6cfe/#summary

As another example, a JPMorgan Chase ad and a LifeStyleAlive law firm ad were served to URLScan.io’s bot whilst the bot was crawling iheartcraftythings.com on February 29th, 2024 from a data center in Australia. The JPMorgan Chase ad appears to have been served by Trade Desk and Microsoft Xandr SSP (via PMP deal ID “1691758”), both of whom claim to partner with HUMAN Security. The source code of the JPMorgan Chase ad includes code from IAS and references to “charge-allIntegralPageQuality”.

The JPMorgan Chase ad appears to have been targeted to a consumer in Houston, Texas, but was served to a bot in a data center in Australia. It is possible that one or more ad tech vendors are using an outdated IP address to geolocation lookup table.

URLScan.io’s bot crawled iheartcraftythings.com with no pre-existing user IDs on February 29th, 2024 - from a Latitude.sh data center in Australia 🇦🇺. A JPMorgan Chase ad that may have been targeted to consumers in the USA 🇺🇸 was served to URLScan.io’s bot 🤖 with no pre-existing user IDs operating out of a data center in Australia 🇦🇺. The JPMorgan Chase ad was served by Trade Desk and Microsoft Xandr (via PMP deal ID “1691758”). Trade Desk and Microsoft Xandr allegedly partner with HUMAN Security. The ad tech vendors may have inferred the bot’s geolocation as being in Houston, Texas 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database. The source code of the JPMorgan Chase ad includes references to “charge-allIntegralPageQuality” and code from IAS. A LifeStyleAlive law firm ad was also served to URLScan.io’s bot 🤖 with no pre-existing user IDs operating out of a data center in Australia 🇦🇺.

Source: https://urlscan.io/result/9f268d38-f2d1-4d2f-ba4a-cf7ef10bc402/#summary

As another example, a muted, auto-playing, outstream HP video ad was served to URLScan.io’s bot whilst the bot was crawling yahoo.com on September 15th, 2022. The HP video ad appears to have been served by Trade Desk and Media.net (via PMP deal ID “mn1_HP_ttd_v”), both of whom claim to partner with HUMAN Security. The HP ad includes code from IAS.

A muted, auto-playing, outstream HP video ad was served to URLScan.io’s bot whilst the bot was crawling yahoo.com on September 15th, 2022. The HP video ad appears to have been served by Trade Desk and Media.net (via PMP deal ID “mn1_HP_ttd_v”), both of whom claim to partner with HUMAN Security. The HP ad includes code from IAS. Source: https://urlscan.io/result/e73933e3-5383-4e6e-8d6b-a8b70553edbb/

As another example, a Diageo Smirnoff alcohol ad was served to URLScan.io’s bot whilst the bot was crawling eliteprospects.com on November 25th, 2024 with no pre-existing user identifiers. The Diageo alcohol ad appears to have been served by Trade Desk and Triplelift (via PMP deal ID “tlx-63903”), both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”. Two HP ads were also served to the bot with no pre-existing user identifiers.

A Diageo Smirnoff alcohol ad was served to URLScan.io’s bot whilst the bot was crawling eliteprospects.com on November 25th, 2024 with no pre-existing user identifiers. The Diageo alcohol ad appears to have been served by Trade Desk and Triplelift (via PMP deal ID “tlx-63903”), both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”. Two HP ads were also served to the bot with no pre-existing user identifiers. Source: https://urlscan.io/result/1a9ca0a9-a7c8-4c45-bb8d-d7f33f39c14a/#summary

As another example, a Nextsense.org.au (hearing aids) ad and two T-Mobile ads were served to URLScan.io’s bot whilst the bot was crawling latimes.com on October 4th, 2023 with no pre-existing user identifiers from a data center in Australia. The T-Mobile ads are not viewable above the fold in the screenshot generated by the bot. The T-Mobile ads were served to the bot by Trade Desk and Magnite (via PMP deal IDs “TRD-20520-0007d” and “MGNI-AP-5083”), both of whom claim to partner with HUMAN Security.

The T-Mobile ads appear to have been targeted to a consumer in Prescott, Arizona, USA when they were served to a bot in a data center in Australia. This may be due to one or more ad tech vendors using an outdated IP address to geolocation lookup database.

A Nextsense.org.au (hearing aids) ad and two T-Mobile ads were served to URLScan.io’s bot whilst the bot was crawling latimes.com on October 4th, 2023 with no pre-existing user identifiers from a data center in Australia. The T-Mobile ads are not viewable above the fold in the screenshot generated by the bot. The T-Mobile ads were served to the bot by Trade Desk and Magnite (via PMP deal IDs “TRD-20520-0007d” and “MGNI-AP-5083”), both of whom claim to partner with HUMAN Security. The T-Mobile ads appear to have been targeted to a consumer in Prescott, Arizona, USA when they were served to a bot in a data center in Australia. This may be due to one or more ad tech vendors using an outdated IP address to geolocation lookup database. Source: https://urlscan.io/result/b4735ac4-80d9-45f1-bdb4-2f65b280f1e1/#summary

As another example, three Morey’s Pier New Jersey amusement park ads were served to URLScan.io’s bot whilst the bot was crawling tasteofhome.com on December 12th, 2023 with no pre-existing user identifiers from a data center in Australia. The three Morey’s Pier ads were served to the bot by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security.

The three Morey’s Pier ads appear to have been targeted to a consumer in Delaware, USA when they were served to a bot in a data center in Australia. This may be due to one or more ad tech vendors using an outdated IP address to geolocation lookup database.

Three Morey’s Pier New Jersey amusement park ads were served to URLScan.io’s bot whilst the bot was crawling tasteofhome.com on December 12th, 2023 with no pre-existing user identifiers from a data center in Australia. The three Morey’s Pier ads were served to the bot by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security. The three Morey’s Pier ads appear to have been targeted to a consumer in Delaware, USA when they were served to a bot in a data center in Australia. This may be due to one or more ad tech vendors using an outdated IP address to geolocation lookup database. Source: https://urlscan.io/result/99e55722-31d7-4e8c-964b-006bc2070c40/#summary

As another example, an Australian Defence Force (ADF) government ad and an ESPN ad were served to URLScan.io’s bot whilst the bot was crawling thebalanceofmoney.com on December 23rd, 2023 with no pre-existing user identifiers from a data center in Australia. The Australian government ADF ad was served to the bot by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security. The ESPN (owned by Disney) ad appears to have been transacted by Publicis, Padsquad, Yahoo DSP and Triplelift.

An Australian Defence Force (ADF) government ad and an ESPN ad were served to URLScan.io’s bot whilst the bot was crawling thebalanceofmoney.com on December 23rd, 2023 with no pre-existing user identifiers from a data center in Australia. The Australian government ADF ad was served to the bot by Trade Desk and Magnite, both of whom claim to partner with HUMAN Security. The ESPN (owned by Disney) ad appears to have been transacted by Publicis, Padsquad, Yahoo DSP and Triplelift. Source: https://urlscan.io/result/4dfa40d2-4138-4307-9006-7d105133f833/#summary

As another example, a Novo Nordisk Ozempic ad, an Abbvie Botox ad, and a Capterra ad were served to URLScan.io’s bot whilst the bot was crawling merriam-webster.com on August 5th, 2023 with no pre-existing user identifiers.

The Novo Nordisk Ozempic ad was served to a bot by Trade Desk and Pubmatic (via PMP deal ID “PM-CXIX-9328”), both of whom claim to partner with HUMAN Security. The source code of the Novo Nordisk ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify.

The Abbvie Botox ad was served to a bot by Trade Desk and Unruly. The source code of the Abbvie Botox ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify.

A Novo Nordisk Ozempic ad, an Abbvie Botox ad, and a Capterra ad were served to URLScan.io’s bot whilst the bot was crawling merriam-webster.com on August 5th, 2023 with no pre-existing user identifiers. The Novo Nordisk Ozempic ad was served to a bot by Trade Desk and Pubmatic (via PMP deal ID “PM-CXIX-9328”), both of whom claim to partner with HUMAN Security. The source code of the Novo Nordisk ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. The Abbvie Botox ad was served to a bot by Trade Desk and Unruly. The source code of the Abbvie Botox ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. Source: https://urlscan.io/result/96e4ecf1-3efc-429f-967e-42137c5ae745/

As another example, two Hershey's Ice Breakers ads, three US Bank ads, and a Lay’s Baked Oven Potato Chips ad were served to URLScan.io’s bot whilst the bot was crawling lifebyleanna.com on November 5th, 2023 from a data center with no pre-existing user identifiers. Several of the ads are not visible above-the-fold in the screenshot generated by the bot.

The Hershey’s ads were served by Trade Desk and GumGum (via PMP deal ID “14922”), both of whom claim to partner with HUMAN Security. The source code of the Hershey’s ads that were served to a bot includes references to IAS and “charge-allIntegralQualitySync”.

The three US Bank ads - none of which are visible above-the-fold in the screenshot generated by the bot - appear to have been transacted by Trade Desk, Index Exchange, and OpenX - all three of whom claim to partner with HUMAN Security. The source code of the three US Bank ads includes references to “charge-allGrapeshotVieawbility” and “charge-allGrapeshotDisplayPageQuality”.

The Lay’s Baked Oven Potato Chips ad appears to have been transacted by Amazon DSP.

Two Hershey's Ice Breakers ads, three US Bank ads, and a Lay’s Baked Oven Potato Chips ad were served to URLScan.io’s bot whilst the bot was crawling lifebyleanna.com on November 5th, 2023 from a data center with no pre-existing user identifiers. Several of the ads are not visible above-the-fold in the screenshot generated by the bot. The Hershey’s ads were served by Trade Desk and GumGum (via PMP deal ID “14922”), both of whom claim to partner with HUMAN Security. The source code of the Hershey’s ads that were served to a bot includes references to IAS and “charge-allIntegralQualitySync”. The three US Bank ads - none of which are visible above-the-fold in the screenshot generated by the bot - appear to have been transacted by Trade Desk, Index Exchange, and OpenX - all three of whom claim to partner with HUMAN Security. The source code of the three US Bank ads includes references to “charge-allGrapeshotVieawbility” and “charge-allGrapeshotDisplayPageQuality”. The Lay’s Baked Oven Potato Chips ad appears to have been transacted by Amazon DSP.

Source: https://urlscan.io/result/04100fdb-fe0d-43db-ac51-56d763eb1b42/#summary

As another example, a Mastercard ad, two Govee ads, and a T-Mobile ad were served to URLScan.io’s bot whilst the bot was crawling laughingsquid.com on December 13th, 2023 from a data center with no pre-existing user identifiers.

The Mastercard ad appears to have been transacted by Trade Desk and Triplelift (via PMP deal ID “tlx-59219”), both of whom claim to partner with HUMAN Security.

The two Govee ads appear to have been served by Trade Desk and Index Exchange. Only one of the two Govee ads is visible above-the-fold in the screenshot generated by the URLScan.io bot.

The T-Mobile ad appears to have been served to the bot by Trade Desk Open Path. The source code of the T-Mobile ad includes references to IAS and “adbrain”. The T-Mobile ad is not visible above-the-fold in the screenshot generated by URLScan.io’s bot.

A Mastercard ad, two Govee ads, and a T-Mobile ad were served to URLScan.io’s bot whilst the bot was crawling laughingsquid.com on December 13th, 2023 from a data center with no pre-existing user identifiers. Source: https://urlscan.io/result/6a5c8b8f-4234-4951-b7b3-0425c1a5f04f/

As another example, two AmeriHealth Caritas health insurance ads were served to URLScan.io’s bot whilst the bot was crawling 9news.com.au on October 24th, 2024 from a data center in Australia with no pre-existing user identifiers. The AmeriHealth Caritas ads appear to have been transacted by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security.

The AmeriHealth Caritas health insurance ads appear to have been targeted to a consumer in Delaware, USA when they were served to a bot in a data center in Australia. It is possible that one or more ad tech vendors are using an outdated IP address-to-geolocation lookup database.

Two AmeriHealth Caritas health insurance ads were served to URLScan.io’s bot whilst the bot was crawling 9news.com.au on October 24th, 2024 from a data center in Australia with no pre-existing user identifiers. The AmeriHealth Caritas ads appear to have been transacted by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security. Source: https://urlscan.io/result/df8894c3-45e8-4bb4-8f4c-bf364d3612dc/

As another example, an American Express video ad was served to URLScan.io’s bot whilst the bot was crawling variety.com on January 27th, 2023 from a data center with no pre-existing user identifiers. The American Express ad appears to have been transacted by IPG media agency, Trade Desk and Google AdX. The American Express ad contains code from IAS.

An American Express video ad was served to URLScan.io’s bot whilst the bot was crawling variety.com on January 27th, 2023 from a data center with no pre-existing user identifiers. The American Express ad appears to have been transacted by IPG media agency, Trade Desk and Google AdX. The American Express ad contains code from IAS. Source: https://urlscan.io/result/d4b0b68f-6636-41d5-b10c-8df5251c160f/

As another example, two Microsoft ads were served to URLScan.io’s bot whilst the bot was crawling economist.com on December 14th, 2017 from a Nexeon data center with no pre-existing user IDs. The Microsoft ads appear to have been served by Trade Desk and Google AdX (via PMP deal ID “366693084984142922”). The source code of the Microsoft ads that were served to a bot in a data center contains references to IAS and to “charge-allIntegralQualitySync”.

Two Microsoft ads were served to URLScan.io’s bot whilst the bot was crawling economist.com on December 14th, 2017 from a Nexeon data center with no pre-existing user IDs. The Microsoft ads appear to have been served by Trade Desk and Google AdX (via PMP deal ID “366693084984142922”). The source code of the Microsoft ads that were served to a bot in a data center contains references to IAS and to “charge-allIntegralQualitySync”. Source: https://urlscan.io/result/f10fb0fa-ea19-4f03-b060-0e773dfebd18/#summary

As another example, a Harvard Business School executive education ad and an American Express ad were served to URLScan.io’s bot whilst it was crawling eliteprospects.com on November 26th, 2024. The Harvard Business School ad appears to have been transacted by Trade Desk and Pubmatic, both of whom claim to partner with HUMAN Security. The Harvard Business School ad that was served to a bot includes code from IAS.

A Harvard Business School executive education ad and an American Express ad were served to URLScan.io’s bot whilst it was crawling eliteprospects.com on November 26th, 2024. The Harvard Business School ad appears to have been transacted by Trade Desk and Pubmatic, both of whom claim to partner with HUMAN Security. The Harvard Business School ad that was served to a bot includes code from IAS. Source: https://urlscan.io/result/f65f52d1-3ba1-42bc-93aa-39d6891e3925/#summary

As another example, a Transportation Security Administration (TSA) Department of Homeland Security (DHS) ad was served to URLScan.io’s bot whilst it was crawling rapidcityjournal.com on December 13th, 2023. The TSA ad appears to have been transacted by Trade Desk and Pubmatic, both of whom claim to partner with HUMAN Security.

Screenshot of a Transportation Security Administration (TSA) job recruiting ad, served to a bot in Australia. TSA is part of the US Department of Homeland Security. Source: https://urlscan.io/result/9fbad16b-d196-4f2f-8905-60adfb681016/#summary

Ads served by Trade Desk Open Path to declared bots on the IAB Bots List operating out of known data center IP addresses

Over 1,700 different brands and advertisers were observed transacting via Trade Desk Open Path and having their ads served to declared bots (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses. Open Path is a Trade Desk initiative that circumvents traditional supply side platforms and gives media buyers direct access to inventory sold by publishers to the Trade Desk DSP.

Screenshot of the Trade Desk Open Path page - https://www.thetradedesk.com/us/our-platform/openpath

It is not immediately apparent from Trade Desk’s public documentation whether Trade Desk Open Path scans every ad impression pre-bid with HUMAN Security’s MediaGuard before deciding to serve a client’s ad to a declared bot in a data center.

However, Trade Desk previously announced that it would “scan every biddable ad impression in real-time” using HUMAN Security’s tools.

Screenshot of a joint Trade Desk and HUMAN Security (f/k/a White Ops) press release
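
For context, a pre-bid invalid-traffic check of the kind that announcement describes could, in principle, be as simple as the sketch below: compare the bid request’s user agent against the IAB Tech Lab Spiders and Bots list and its IP against known data center ranges. This is a hypothetical illustration, not HUMAN Security’s MediaGuard or Trade Desk’s actual implementation; the list entries and the IP range shown are assumptions.

```python
# Hypothetical pre-bid invalid-traffic (IVT) filter. This is NOT HUMAN
# Security's MediaGuard or Trade Desk's actual implementation; the list
# entries and the data center IP range are assumptions for illustration.
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

IAB_BOT_UA_SUBSTRINGS = {"httparchive", "urlscan"}   # illustrative entries only
DATACENTER_NETWORKS = [ip_network("35.192.0.0/12")]  # illustrative cloud range

@dataclass
class BidRequest:
    user_agent: str
    ip: str

def is_invalid_traffic(req: BidRequest) -> bool:
    ua = req.user_agent.lower()
    if any(token in ua for token in IAB_BOT_UA_SUBSTRINGS):
        return True  # a declared bot announces itself in its user agent
    return any(ip_address(req.ip) in net for net in DATACENTER_NETWORKS)

req = BidRequest(user_agent="Mozilla/5.0 (compatible; HTTPArchive)", ip="35.192.10.4")
print(is_invalid_traffic(req))  # True: this impression should never be bought
```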

The list of advertisers whose ads were served by Trade Desk Open Path to declared bots (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses included Progressive insurance, P&G, T-Mobile, Hershey’s, Royal Caribbean, Walmart, US Bank, Wawa, Kimberly-Clark, the Government of DC, Unilever, American Airlines, UPS, Coca-Cola, Subaru, Sling, Samsung, H&R Block, Microsoft, Chewy, Prudential, Abbvie, Sherwin-Williams, Best Buy, Enterprise car rental, Amazon, Google Fiber, Hilton hotels, Nestle, Lockheed Martin, Starbucks, Western Union, US Cellular, Disney, Thomson Reuters, McDonald’s, Cigna, Bayer, American Express, JPMorgan Chase, Henkel, L3Harris, and many others.

For example, T-Mobile had its ads served by Trade Desk Open Path to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses, whilst the bot was crawling the website sweetlittlebluebird.com. The T-Mobile ad was transacted via PMP deal ID “opath-192-4001-48ha53ha64”. In another example, T-Mobile ads were served by Trade Desk Open Path to a declared bot whilst the bot was crawling chelsweets.com. The ad was transacted via PMP deal ID “opath-202-4001-r0ygm500ex”. At least six different deal IDs were observed transacting T-Mobile ads via Trade Desk Open Path to declared bots on the IAB Bots List operating out of known data center IP addresses. These PMP deal IDs included: “opath-192-4001-48ha53ha64”, “opath-192-4001-3j23aoyt60”, “opath-202-4001-bw61qpmbmy”, “opath-202-4001-82coeyei2u”, “opath-202-4001-r0ygm500ex”, and “opath-220-4001-rfv8cl6e5t”.

As a second example, Hershey’s had its ads served by Trade Desk Open Path to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses, whilst the bot was crawling the website everydaymadefresh.com in December 2022. In another example, Hershey's ads were served by Trade Desk Open Path to a declared bot whilst the bot was crawling asvabpracticetests.com in March 2024. The ad was transacted via PMP deal ID “3409-192-all-1”.

As a third example, Procter & Gamble (P&G) had its ads served by Trade Desk Open Path to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses, whilst the bot was crawling the website reciperunner.com in November 2022. The P&G ad was for the Mr Clean product line. In another example, P&G ads were served by Trade Desk Open Path to a declared bot whilst the bot was crawling littlelearningcorner.com in October 2023. The P&G ad was for the Pampers product line.

Private marketplace (PMP) and other curated ad deals served to bots by vendors which allegedly partner with HUMAN Security

It appears that many brands who were transacting ad inventory via private marketplace (PMP) or other types of curated ad deals had their ads served to bots.
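
In OpenRTB-based transactions, these PMP deal IDs travel in the bid itself (the standard seatbid[].bid[].dealid field), which is how they can be read out of the traffic a bot records. The sketch below extracts deal IDs from a sample bid response; the sample payload is invented, re-using one of the observed deal ID strings.

```python
# Sketch: extracting PMP deal IDs from an OpenRTB 2.x bid response. The
# field path seatbid[].bid[].dealid is standard OpenRTB; the sample payload
# is invented, re-using one of the deal ID strings observed in this study.
import json

sample_response = json.loads("""
{
  "id": "sample-auction-1",
  "seatbid": [
    {"bid": [{"id": "1", "impid": "1", "price": 1.20,
              "dealid": "yieldmo_p&g_fabriccare_instreamvideo_ctx_ttd"}]}
  ]
}
""")

def deal_ids(bid_response: dict) -> list:
    return [bid["dealid"]
            for seat in bid_response.get("seatbid", [])
            for bid in seat.get("bid", [])
            if "dealid" in bid]

print(deal_ids(sample_response))
```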

For example, Procter & Gamble (P&G) had ads transacted via Yieldmo and Trade Desk PMP deals served to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses, via PMP deal IDs “yieldmo_p&g_fabriccare_instreamvideo_ctx_ttd” and “yieldmo_p&g_fabriccare_outstreamvideo_ron_ttd”. The P&G ads were for the Tide product line, and were served to bots crawling websites such as thetravellingtom.com, foodnouveau.com, and vintagerevivals.com. P&G also had ads for Charmin served to declared bots operating out of data centers by Connatix and Trade Desk. The P&G Charmin ads were transacted via PMP deal IDs named “Connatix-TTD-PGDirect-Charmin-2023-Mobile”, “Connatix-TTD-PGMC-PMP-USA-AA”, and “Connatix-TTD-PG-Charmin”.

Table illustrating via which PMP deals Procter & Gamble ads were served to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses

As a second example, PepsiCo had ads for Gatorade sports drink served to declared bots operating out of data centers by Yieldmo and Trade Desk. The Gatorade ads were transacted via a PMP deal ID named “yieldmo_politicalbuyers_standarddisplay_sportspubs_ttd”, and were served to bots crawling websites such as thebusbybabe.sbnation.com and fifaindex.com.

As a third example, ads for the alcohol brand Diageo were served to declared bots operating out of data centers by Yieldmo and Trade Desk. The Diageo ads were transacted via PMP deal IDs named “yieldmo_phdmedia_diageo_donJulio_oustreamvideo_contextual_ttd” and “yieldmo_phd_captainmorgan_instreamvideo_ctx_ttd”, and were served to bots crawling websites such as southernliving.com and vegas.eater.com.

As a fourth example, Hershey’s had ads served to declared bots operating out of data center IPs by GumGum, Index Exchange, Kargo, ShareThrough, Microsoft Xandr, and Trade Desk. The Hershey’s ads served to bots via GumGum were transacted via PMP deal IDs 16650, 22726, 18820, 16660, and others; Hershey’s ads were served to bots via Index Exchange PMP deal IDs 20809072023991 and 320809082023991; Hershey's ads were served to bots via Kargo PMP deal IDs KRG-ohbn-25066, KRG-2t4t-35418, KRG-asd2-26229, KRG-xq8x-26836, KRG-6wcz-26222, and others; Hershey’s ads were served to bots via ShareThrough PMP deal IDs CFYT1, 982gZ, oPf36, zdMHQ, oefFq, erCDb, YH8M4, and others; and Hershey’s ads were served to bots via Microsoft Xandr PMP deal IDs 1297203 and 1512079.

Table illustrating via which PMP deals Hershey’s ads were served to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses

As a fifth example, T-Mobile had ads served to declared bots operating out of data center IPs by Index Exchange, Microsoft Xandr, ShareThrough, Connatix, Kargo, OpenX, Google, and Trade Desk Open Path. The T-Mobile ads served to bots via Index Exchange were transacted via PMP deal IDs IXCPDTTDTMOBILEDESK, IXCPDTTDTMOBILEMOBAPP, IXCPDTTDTMOBILEOUT, and others; the T-Mobile ads served to bots via Kargo were transacted via PMP deal IDs KRG-lhgk-27510, KRG-vkt1-27511, and KRG-78wk-27512; the T-Mobile ads served to bots via Connatix were transacted via PMP deal IDs Connatix-TTD-TMobile-RON-Desktop and Connatix-TTD-TMobile-RON-Mobile; and the T-Mobile ads served to bots via ShareThrough were transacted via PMP deal ID 1zzwD.

Table illustrating via which PMP deals T-Mobile ads were served to a declared bot (whose user agent has been on the IAB Tech Lab Spiders and Bots list since 2013) operating out of well known data center IP addresses

Data table showing a sample of brands whose ads were served to declared bots in data center server farms, wherein the ads appear to have been mediated via a private marketplace or other type of deal ID.

Ads promoting the Trade Desk itself - in its capacity as an advertiser - served to bots by Trade Desk and other vendors that claim to partner with HUMAN Security

The Trade Desk was observed running ads on various websites promoting its own product and services.

For example, in the screenshots below, one can see a few examples of Trade Desk ad creatives. The first ad appears to be a job recruiting ad, soliciting applications to work for the Trade Desk. The second ad appears to be promoting Trade Desk’s connected television offerings.

Trade Desk was observed serving its own job recruiting and product marketing ads to bots over the course of several years.

For example, in the screenshot below, one can see a Trade Desk ad that says “UNDERSTAND THE TRUE VALUE OF EACH IMPRESSION”, which was served to URLScan.io’s bot whilst the bot was crawling baby-chick.com on January 22nd, 2024. The Trade Desk ad was served to the bot by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security.

A Trade Desk ad that says “UNDERSTAND THE TRUE VALUE OF EACH IMPRESSION”, served to URLScan.io’s bot whilst the bot was crawling baby-chick.com on January 22nd, 2024. The Trade Desk ad was served to the bot by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security. Source: https://urlscan.io/result/132e27d6-83f5-4037-9ffc-0c63a1264818/#summary

As another example, in the screenshot below, one can see three Trade Desk job recruiting ads that were served to URLScan.io’s bot whilst the bot was crawling dzone.com on October 6th, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX.

Three Trade Desk job recruiting ads that were served to URLScan.io’s bot whilst the bot was crawling dzone.com on October 6th, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX. Source: https://urlscan.io/result/f94c409a-cb5b-4fd8-ba1d-4ae1aece6df9/#summary

As another example, in the screenshot below, one can see two Trade Desk ads that were served to URLScan.io’s bot whilst the bot was crawling adage.com on March 22nd, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX.

Two Trade Desk ads that were served to URLScan.io’s bot whilst the bot was crawling adage.com on March 22nd, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX. Source: https://urlscan.io/result/d60bb120-1f01-42ad-ab05-57ccd2b048a9/#summary

As another example, in the screenshot below, one can see two Trade Desk ads that were served to URLScan.io’s bot whilst the bot was crawling adweek.com on April 11th, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX.

Two Trade Desk ads that were served to URLScan.io’s bot whilst the bot was crawling adweek.com on April 11th, 2022. The Trade Desk ads were served to the bot by Trade Desk and Google AdX. Source: https://urlscan.io/result/877b5869-ac7b-4b87-a746-b7bdb0f68998/#summary

As another example, in the screenshot below, one can see a Trade Desk ad that says “Inside political marketers’ strategies to get Democrats elected”, which was served to URLScan.io’s bot whilst the bot was crawling thetinylife.com on October 24th, 2024. The Trade Desk ad was served to the bot by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security.

A Trade Desk ad that says “Inside political marketers’ strategies to get Democrats elected”, served to URLScan.io’s bot whilst the bot was crawling thetinylife.com on October 24th, 2024. The Trade Desk ad was served to the bot by Trade Desk and Index Exchange, both of whom claim to partner with HUMAN Security. Source: https://urlscan.io/result/4a0e7aa3-225a-4a0b-94fd-a39850585653/#summary

As another example, in the screenshot below, one can see a Trade Desk video ad that was served to URLScan.io’s bot whilst the bot was crawling reuters.com on June 17th, 2022. The Trade Desk ad was served to the bot by Trade Desk and Connatix.

A Trade Desk video ad that was served to URLScan.io’s bot whilst the bot was crawling reuters.com on June 17th, 2022. The Trade Desk ad was served to the bot by Trade Desk and Connatix. Source: https://urlscan.io/result/de18baf6-5857-411e-a82b-ff4d456f4d0e/#summary

As another example, in the screenshot below, one can see a Trade Desk video ad that was served to URLScan.io’s bot whilst the bot was crawling reuters.com on June 15th, 2022. The Trade Desk ad was served to the bot by Trade Desk and Connatix.

A Trade Desk video ad that was served to URLScan.io’s bot whilst the bot was crawling reuters.com on June 15th, 2022. The Trade Desk ad was served to the bot by Trade Desk and Connatix. Source: https://urlscan.io/result/6fc7993a-d069-404f-a5aa-a745a276c0e5/#summary

Research Results: Brands whose ads include references to “charge-allDoubleVerifyBotAvoidance” and had their ads served to bots

Hundreds of major brands whose ads’ source code includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify appear to have had their ads served to bots in data centers.

Firstly, many brands whose ads’ source code includes references to “charge-allDoubleVerifyBotAvoidance” had their ads served to declared bots operating out of well known data center IP addresses. The brands had their ads served to the HTTP Archive bot, whose user agent has been on the IAB Tech Lab Spiders and Bots reference list since 2013. The IP addresses of data centers used by the HTTP Archive are well known, publicized Google Cloud data center IPs.
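
The “charge-all…” strings are literal markers embedded in the served ad markup, so they can be found by scanning the creative bodies a crawler saved. A minimal sketch, assuming ad responses saved as local files (the file path is hypothetical):

```python
# Sketch: scanning saved ad creative bodies for "charge-all..." markers.
# The file path is hypothetical; the regex simply collects every marker of
# the form observed in this study (e.g. charge-allDoubleVerifyBotAvoidance).
import re
from pathlib import Path

MARKER = re.compile(r"charge-all[A-Za-z]+")

def markers_in_response(path: Path) -> set:
    text = path.read_text(errors="ignore")
    return set(MARKER.findall(text))

# e.g. a creative body captured while the bot crawled a page:
# print(markers_in_response(Path("responses/us_navy_ad.html")))
```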

For example, the United States Navy had ads served over the course of several years to declared bots on the IAB Bots List operating out of known data center IPs, wherein the US Navy ads’ source code included references to “charge-allDoubleVerifyBotAvoidance”. In one instance, the US Navy had its ads served to a declared bot operating out of a known data center IP in September 2022 on the website infectious-smile.fandom.com. The US Navy ad was transacted by the SSP Kargo, via Kargo PMP deal ID KRG-2pwj-22414. Kargo SSP also partners with HUMAN Security for bot mitigation. The US Navy also had ads served via Index Exchange, Microsoft Xandr, and the Media Grid to declared bots on the IAB Bots List operating out of known data center IPs, wherein the US Navy ads’ source code included references to “charge-allDoubleVerifyBotAvoidance”.

As a second example, Pfizer had paxlovid.com ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the Pfizer ads’ source code included references to “charge-allDoubleVerifyBotAvoidance”. In one instance, in November 2023, Pfizer had its paxlovid.com ads served to a declared bot operating out of a known data center IP whilst the HTTP Archive bot was crawling the website testsworld.net.

As a third example, the Government of DC had dchealth.dc.gov ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the DC Government ads’ source code included references to “charge-allDoubleVerifyBotAvoidance”. In one instance, in August 2022, the DC Government had its dchealth.dc.gov ads served to a declared bot operating out of a known data center IP whilst the HTTP Archive bot was crawling the website selonpepite.canalblog.com.

As a fourth example, Haleon had advil.com ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the Haleon ads’ source code included references to “charge-allDoubleVerifyBotAvoidance”. In one instance, in March 2024, Haleon had its advil.com ads served to a declared bot operating out of a known data center IP whilst the HTTP Archive bot was crawling the website cinderella-phenomenon.fandom.com.

The list of brands whose ads’ source code includes references to “charge-allDoubleVerifyBotAvoidance”, and had their ads served to a declared bot on the IAB Bots List operating out of a known data center IP address included many major brands such as the United States Navy, the Government of the State of New York, the Government of Singapore, Ernst & Young (EY), IBM, Unilever, Bank of America, Adobe, McCormick & Company, Tyson Foods, Wawa, US Cellular, Jimmy Dean, Ozempic (owned by Novo Nordisk), Abbott, T. Rowe Price, Lexus, Abbvie, American Eagle, Chick-fil-A, US Bank, Amazon, Edward Jones, Vauxhall Motors, Subaru, Amgen, Aveva, L’Occitane, Ticketmaster, Diageo, Asus, Michelin tire company, Kia, Orkin, Siemens Healthineers, SurveyMonkey, Disabled American Veterans (dav.org), Intuit, Panera Bread, CBS Interactive, 1Password, Anti-Defamation League (adl.org), AMC Theaters, BMW USA, Paxlovid (owned by Pfizer), Vanguard, Arexvy (owned by GlaxoSmithKline or GSK), Salix Pharmaceuticals, and many others.

Furthermore, many brands whose ads appear to contain code from DoubleVerify and whose source code references “charge-allDoubleVerifyBotAvoidance” appear to have had their ads served to URLScan.io’s bot when the bot was crawling various websites with no pre-existing user IDs.

For example, in the screenshot below, one can see an instance where a T. Rowe Price ad was served to URLScan.io’s bot whilst the bot was crawling washingtonpost.com on March 18th, 2022. The T. Rowe Price ad was transacted by GroupM, Trade Desk, and Index Exchange (via PMP deal ID “GROUPMIXFLOOR1WAPO”). The source code of the T. Rowe Price ad that was served to URLScan.io’s bot includes source code from DoubleVerify and references to “charge-allDoubleVerifyBotAvoidance”.

Screenshot of a T. Rowe Price ad served to URLScan.io’s bot and transacted by GroupM, Trade Desk, and Index Exchange. The source code of the T. Rowe Price ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/4020f54f-d1d2-44c0-84d8-46cadc0032e0/

As another example, in the screenshot below, one can see an instance where an IBM ad was served to URLScan.io’s bot whilst the bot was crawling washingtonpost.com on May 24th, 2024. The IBM ad was transacted by GroupM, Trade Desk, and Index Exchange (via PMP deal ID “IXCPDGROUPM8BARIMGLOBAL”). The source code of the IBM ad that was served to URLScan.io’s bot includes source code from DoubleVerify and references to “charge-allDoubleVerifyBotAvoidance”.

Screenshot of an IBM ad served to URLScan.io’s bot and transacted by GroupM, Trade Desk, and Index Exchange. The source code of the IBM ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/6c90a702-f944-472c-a91c-267bc1532895/

As another example, in the screenshot below, one can see an instance where an IBM ad was served to URLScan.io’s bot whilst the bot was crawling gyytfguy654rffyu7655rrfhhu776444rfgghu7755rrggy7stmmmwwewe5t.hamidstm-stm.workers.dev on March 12th, 2024. The IBM ad was transacted by GroupM, Trade Desk, and Microsoft Xandr (via PMP deal ID “IXCPDGROUPM8BARIMGLOBAL”). The source code of the IBM ad that was served to URLScan.io’s bot with no pre-existing user IDs includes source code from DoubleVerify and references to “charge-allDoubleVerifyBotAvoidance”.

Screenshot of an IBM ad served to URLScan.io’s bot and transacted by GroupM, Trade Desk, and Microsoft Xandr SSP. The source code of the IBM ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/65643690-293b-40e7-bba8-029588d7a677/

As another example, a Bank of America Merrill Lynch ad was served to URLScan.io’s bot whilst the bot was crawling jsonline.com on January 24th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Sharethrough, all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

The publisher jsonline.com appears to have configured IAS’ publisher optimization tool on its page. The IAS tool appears to have classified the URLScan.io bot as “fr=false” (valid, human traffic).

Screenshot of a Bank of America Merrill Lynch ad served to URLScan.io’s bot and transacted by Omnicom, Trade Desk, and Sharethrough. The source code of the Bank of America ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/1fe20077-1bab-415e-9f37-dbbd317563e9/

As another example, a Bank of America Merrill Lynch ad was served to URLScan.io’s bot whilst the bot was crawling healthline.com on February 5th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Triplelift (via PMP deal ID “tlx-36443”), all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

A Bank of America Merrill Lynch ad served to URLScan.io’s bot whilst the bot was crawling healthline.com on February 5th, 2022 from a data center. The Bank of America ad appears to have been transacted by Omnicom media agency and was served by Trade Desk and Triplelift, all three of whom claim to partner with HUMAN Security. The Bank of America ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

Source: https://urlscan.io/result/f2f44b2a-0333-4446-9909-512834d0581b/

As another example, a Diageo alcohol ad was served to URLScan.io’s bot whilst the bot was crawling britannica.com on July 18th, 2024. The Diageo ad appears to have been served by Trade Desk and Sovrn, both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”.

A Diageo alcohol ad served to URLScan.io’s bot whilst the bot was crawling britannica.com on July 18th, 2024. The Diageo ad appears to have been served by Trade Desk and Sovrn, both of whom claim to partner with HUMAN Security. The Diageo alcohol ad includes references to DoubleVerify and “charge-allDoubleVerifyBotAvoidance”. Source: https://urlscan.io/result/4761b152-a29e-4cc0-85c1-62205f764254/

As another example, a Novo Nordisk Ozempic ad was served to URLScan.io’s bot whilst the bot was crawling yume2kki.fandom.com on November 7th, 2023 with no pre-existing user identifiers.

The Novo Nordisk Ozempic ad was served to a bot by Trade Desk and Pubmatic (via PMP deal ID “PM-CXIX-9328”), both of whom claim to partner with HUMAN Security. The source code of the Novo Nordisk ad includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. The ad appears to have been intended for a consumer in Delaware, USA but was served to a bot in a data center in Germany.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled yume2kki.fandom.com from a data center in Frankfurt, Germany 🇩🇪 on November 7th, 2023. A Novo Nordisk Ozempic pharmaceutical ad that may have been intended for a consumer in the USA 🇺🇸 was served to a bot with no user IDs in a data center in Germany 🇩🇪. The ad was served by Trade Desk and Pubmatic SSP (via PMP deal ID “PM-CXIX-9328”), both of whom claim to partner with HUMAN Security. The ad tech vendors may have inferred the bot's geolocation as being in Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database. The source code of the Novo Nordisk Ozempic ad that was served to a bot in a German data center includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. A Coca-Cola ad that may have been intended for a consumer in the USA 🇺🇸 was also served to the bot with no user IDs in a data center in Germany 🇩🇪. The Coca-Cola ad was served by Trade Desk and Magnite SSP, both of whom claim to partner with HUMAN Security. The ad tech vendors may have inferred the bot's geolocation as being in Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database. The source code of the Coca-Cola ad that was served to a bot in a German data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Source: https://urlscan.io/result/3c636fea-fef8-49d2-9220-287d95ac0625/

As another example, in the screenshot below one can observe ads for the following companies: Abbvie, Progressive insurance, and McCormick, which were served to URLScan.io’s bot on August 5th, 2023 whilst crawling merriam-webster.com. The Abbvie Botox ad appears to have been transacted by Trade Desk and Index Exchange, and its source code includes references to “charge-allDoubleVerifyBotAvoidance”. The McCormick ad appears to have been transacted by Trade Desk and Triplelift (via PMP deal ID “tlx-48305”), and its source code includes references to “charge-allDoubleVerifyBotAvoidance”. The Progressive insurance ad appears to have been transacted by Trade Desk and Yieldmo, and its source code includes references to Scibids.

Multiple DV360 and Trade Desk ads served to URLScan.io’s bot, which was operating out of a data center with no pre-existing user IDs whilst crawling merriam-webster.com on August 5th, 2023. The bot was served two ads for Virgin Voyages by Google, and ads for McCormick, Progressive, and Abbvie Botox. The McCormick and Abbvie ads’ source code contains references to “charge-allDoubleVerifyBotAvoidance”, whilst the Progressive video ad contains references to “charge-allScibids”. The ads were transacted by Yieldmo, Index Exchange, and TripleLift, all of whom have made many public statements about partnering with HUMAN Security. Source: https://urlscan.io/result/4bae7001-0889-4c29-9256-1cc88df48182/

As another example, in the screenshot below one can observe two ads for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot on June 21st, 2024 whilst crawling companiesmarketcap.com. The EY ads appear to have been transacted by Trade Desk and Microsoft Xandr, both of whom claim to partner with HUMAN Security. The source code of the EY ads includes references to “charge-allDoubleVerifyBotAvoidance.”

Two ads for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot on June 21st, 2024 whilst crawling companiesmarketcap.com. The EY ads appear to have been transacted by Trade Desk and Microsoft Xandr, both of whom claim to partner with HUMAN Security. The source code of the EY ads includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/4443b1a9-787e-40ac-b274-1fc697e8ab04/#transactions

As another example, in the screenshot below one can see an ad for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot on May 28th, 2024 whilst the bot was crawling marijuanamoment.net. The EY ad appears to have been transacted by Trade Desk Open Path. The source code of the EY ad includes references to “charge-allDoubleVerifyBotAvoidance.”

An ad for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot whilst the bot was crawling marijuanamoment.net on May 28th, 2024. The EY ad appears to have been transacted by Trade Desk Open Path. The source code of the EY ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/1795c9b3-19a1-4305-995a-6fef29b195f6/#summary

As another example, in the screenshot below one can see an ad for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot on June 2nd, 2024 whilst the bot was crawling marijuanamoment.net. The EY ad appears to have been transacted by Trade Desk and Index Exchange (via PMP deal ID “1101-8-all-1”), both of whom claim to partner with HUMAN Security. The source code of the EY ad includes references to “charge-allDoubleVerifyBotAvoidance.”

An ad for Ernst & Young (EY) sponsored content on cio.com served to URLScan.io’s bot on June 2nd, 2024 whilst the bot was crawling marijuanamoment.net. The source code of the EY ad includes references to “charge-allDoubleVerifyBotAvoidance.” Source: https://urlscan.io/result/22e0336f-fae0-4aac-8b50-4d160fa5a9c2/#summary

Research Results: Brands whose ads appear to be mediated by DoubleVerify’s Scibids AI technology and served to bots

In Q2 2023, DoubleVerify acquired a French company called Scibids. According to DoubleVerify, Scibids provides “AI-powered digital campaign optimization.”

“Scibids pulls in impression-level data – including DoubleVerify attention data – to automate and optimize programmatic bids across demand-side platforms”, according to DoubleVerify CEO Mark Zagorski and reporting by AdExchanger. “It looks to minimize the bid to deliver the maximum return,” Zagorski said.

According to AdExchanger, “Scibids’ tech plugs directly into DSPs and gathers first- and third-party data as well as pricing, contextual, performance measurement, programmatic bidstream and measurement data, among others, from clients. Using this data, it dynamically creates “bespoke” bidding algorithms based on specific advertiser KPIs. Scibids is integrated with DSPs such as The Trade Desk, Google DV360, Microsoft’s Xandr and Comcast’s Beeswax. The algo does the trading, bid optimization and decision-making on its own, Zagorski said.”

Source: DoubleVerify

Source: DoubleVerify

The source code of several brands’ ads - whose ads were served to declared bots operating out of known data center IP addresses or to URLScan.io bots - contained references to “charge-allScibids”.
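To illustrate the kind of client-side forensics used throughout this section, the sketch below searches an ad creative’s markup for the marker strings documented in this report. It is a minimal illustration, assuming the creative’s source is already available as text; the find_markers helper and the sample creative snippet are hypothetical.

```python
# Marker strings observed in ad creatives' source code in this study.
MARKERS = [
    "charge-allDoubleVerifyBotAvoidance",
    "charge-allScibids",
    "charge-allIntegralSuspiciousActivity",
    "charge-allIntegralQualitySync",
]

def find_markers(ad_source_html: str) -> list[str]:
    """Return every known 'charge-all' style marker found in an ad's source code."""
    return [marker for marker in MARKERS if marker in ad_source_html]

# Hypothetical creative snippet for illustration:
creative = '<img src="https://example.invalid/pixel?tag=charge-allScibids">'
print(find_markers(creative))  # ['charge-allScibids']
```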

For example, in the screenshot below one can see an instance where URLScan.io’s bot crawled the website allrecipes.com with no pre-existing user identifiers on July 21st, 2024 from a data center. The URLScan.io bot was shown an Ontario Lottery and Gaming Corporation (OLG; owned by the Government of Ontario, Canada) ad. The source code of the OLG ad includes references to “charge-allScibids”, “charge-allDoubleVerifyViewability”, “charge-allOfflineAttributionInferredBrandImpactDisplay”, and “AllianceLotame”.

Screenshot of an Ontario Lottery and Gaming Corporation (OLG; owned by the Government of Ontario, Canada) ad shown to URLScan.io’s bot. The source code of the OLG ad includes references to “charge-allScibids”, “charge-allDoubleVerifyViewability”, “charge-allOfflineAttributionInferredBrandImpactDisplay”, and “AllianceLotame”. Not visible in the screenshot is an ad from Haleon for Gaviscon heartburn relief. The Haleon Gaviscon ad was transacted by Trade Desk and Index Exchange (via PMP deal ID “3949-8-all-1”) and includes references to “DoubleVerify”. Source: https://urlscan.io/result/77ed37d4-5b18-4138-ada4-991d520c7fba/

As another example, in the screenshot below one can see an instance where URLScan.io’s bot crawled the website allrecipes.com with no pre-existing user identifiers on July 21st, 2024 from a data center. The URLScan.io bot was shown an Ontario Lottery and Gaming Corporation (OLG; owned by the Government of Ontario, Canada) ad, which is not viewable in the screenshot. The source code of the OLG ad includes references to “charge-allScibids”, “charge-allDoubleVerifyViewability”, “charge-allOfflineAttributionInferredBrandImpactDisplay”, and “AllianceLotame”. The URLScan.io bot was also shown an ad for Google Workspace by Google.

Screenshot of URLScan.io’s bot crawling allrecipes.com on July 21st, 2024, whilst being shown an OLG ad that contains references to “charge-allScibids”. Source: https://urlscan.io/result/272c854e-5ac1-47ac-ace1-3f86fc1a88d7/

As another example, in the screenshot below one can see a Progressive insurance ad that was served to URLScan.io’s bot whilst the bot was crawling the website merriam-webster.com with no pre-existing user identifiers on August 5th, 2023 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”.

Multiple DV360 and Trade Desk ads served to URLScan.io’s bot, which was operating out of a data center with no pre-existing user IDs whilst crawling merriam-webster.com on August 5th, 2023. The bot was served two ads for Virgin Voyages by Google, and ads for McCormick, Progressive, and AbbVie Botox. The McCormick and AbbVie ads’ source code contains references to “charge-allDoubleVerifyBotAvoidance”, whilst the Progressive video ad contains references to “charge-allScibids”. The ads were transacted by Yieldmo, Index Exchange, and TripleLift, all of whom have made many public statements about partnering with HUMAN Security. Source: https://urlscan.io/result/4bae7001-0889-4c29-9256-1cc88df48182/

As another example, in the screenshot below one can see a muted, auto-playing, out-stream video Progressive insurance ad that was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website laughingsquid.com on July 11th, 2023 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling laughingsquid.com on July 11th, 2023, whilst being shown a Progressive insurance ad that contains references to “charge-allScibids”. Source: https://urlscan.io/result/b0bcec35-2305-4945-ac8b-a29237e65bb7/#summary

As another example, in the screenshot below one can see a Progressive insurance ad that was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website wwwrelyinngparty-okta-loginautooscout24ede.tiny.us on June 27th, 2024 from a data center. The website generated a 500 HTTP status error, meaning the website did not load properly when the bot crawled the page. The source code of the Progressive insurance ad includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling wwwrelyinngparty-okta-loginautooscout24ede.tiny.us on June 27th, 2024, whilst being shown a Progressive ad that contains references to “charge-allScibids”. Source: https://urlscan.io/result/b321ca56-3ab6-43ed-a12f-3eefec9e69b1/

As another example, in the screenshot below one can see a Progressive insurance ad that was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website tumblr.com on March 11th, 2024 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling tumblr.com on March 11th, 2024, whilst being shown a Progressive ad that contains references to “charge-allScibids”. Source: https://urlscan.io/result/0207063f-d5b0-49d2-bc41-f6b3fc023710/

As another example, in the screenshot below one can see a Progressive insurance ad that was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website marijuanamoment.net on January 29th, 2024 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling marijuanamoment.net on January 29th, 2024, whilst being shown a Progressive ad that contains references to “charge-allScibids”. Source: https://urlscan.io/result/c7cbfaa0-8f6d-4b1f-bbfb-3a640812eb79/

As another example, a Progressive insurance ad was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website yahoo.com on July 30th, 2023 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”. The Progressive ad is not visible above-the-fold in the screenshot generated by URLScan.io’s bot.

Screenshot of URLScan.io’s bot crawling yahoo.com on July 30th, 2023 whilst being shown a Progressive ad that contains references to “charge-allScibids”. The Progressive ad is not visible above-the-fold in the screenshot generated by the bot. Source: https://urlscan.io/result/2e619864-e540-43f0-9bfb-c5afe41661d2/

As another example, a Progressive insurance ad was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website yahoo.com on January 3rd, 2024 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids”. The Progressive ad is not visible above-the-fold in the screenshot generated by URLScan.io’s bot.

Screenshot of URLScan.io’s bot crawling yahoo.com on January 3rd, 2024 whilst being shown a Progressive ad that contains references to “charge-allScibids”. The Progressive ad is not visible above-the-fold in the screenshot generated by the bot. Source: https://urlscan.io/result/42fc6156-6d4e-4245-84a9-25c8ca6c695d/

As another example, two Progressive insurance ads were served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website i68796d6e617279o6f7267z.oszar.com on July 22nd, 2024 from a data center. The source code of the Progressive insurance ads includes references to “charge-allScibids” and “adbrain”.

Screenshot of URLScan.io’s bot crawling i68796d6e617279o6f7267z.oszar.com on July 22nd, 2024 whilst being shown two Progressive insurance ads that contain references to “charge-allScibids” and “adbrain”. Source: https://urlscan.io/result/53bef6a0-26f0-4599-bf95-4438756ed01b/#summary

As another example, a Progressive insurance ad was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website blackclover.fandom.com on April 2nd, 2024 from a data center. The source code of the Progressive insurance ad includes references to “charge-allScibids” and “adbrain”. The publisher fandom.com appears to employ both IAS’ and DoubleVerify’s publisher optimization tools on its web pages. Fandom lists IAS and DoubleVerify as “partners” in its media kit for prospective advertisers.

Screenshot of URLScan.io’s bot crawling blackclover.fandom.com on April 2nd, 2024 whilst being shown a Progressive insurance ad that contains references to “charge-allScibids” and “adbrain”. The publisher fandom.com appears to have configured the IAS and DoubleVerify publisher optimization tools on its website, and lists IAS and DoubleVerify as “partners” in its media kit for prospective advertisers. Source: https://urlscan.io/result/54af4acf-1183-449d-a939-a79ff4c2df6c/

As another example, two Progressive insurance ads were served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website merriam-webster.com on August 4th, 2023 from a data center. The source code of the Progressive insurance ads includes references to “charge-allScibids”.

Screenshot of URLScan.io’s bot crawling merriam-webster.com on August 4th, 2023 whilst being shown two Progressive insurance ads that contain references to “charge-allScibids”. Source: https://urlscan.io/result/cf08e0fb-a6da-4b80-8fff-a9feac291123/

As another example, in the screenshot below one can see a Progressive insurance ad that was served to URLScan.io’s bot whilst the bot with no pre-existing user identifiers was crawling the website relyingparty-okta-establdedocumentation.tiny.us on June 24th, 2024 from a data center. The website generated a 500 HTTP status error, meaning the website did not load properly when the bot crawled the page. The source code of the Progressive insurance ad includes references to “charge-allScibids” and “adbrain”.

Screenshot of URLScan.io’s bot crawling relyingparty-okta-establdedocumentation.tiny.us on June 24th, 2024 whilst being shown two Progressive insurance ads that contain references to “charge-allScibids” and “adbrain”. Source: https://urlscan.io/result/b9b12626-59a6-47eb-bcd3-40a3ec16573e/

In addition to apparent Scibids clients like Progressive having their ads served to bots, there appear to be instances where Scibids’ own ads for itself were served to bots.

In the screenshot below, one can see an example of a Scibids ad. The Scibids ad says: “The new algorithmic standard for RTB.”

On June 24th, 2024, the Scibids ad was served by DV360 to a URLScan.io bot operating out of an i3D.net data center IP address in Japan. The URLScan.io bot was crawling the page xdaforums.com. One can see a detailed recording of the page crawl session and the screenshot the bot took of the page - including the Scibids ad - here: https://urlscan.io/result/3c0b44e7-7bc3-49e4-bb81-80c5b5eb9a7e/. One can see the DV360 bid response here: https://urlscan.io/responses/660da871034e80d36a3e0ebfbe1ad9de0838a7548654dafdf476d69346dcb48b/

Screenshot of a DoubleVerify Scibids ad served to a URLScan.io bot operating out of an i3D.net data center in Japan. URLScan.io source - https://urlscan.io/result/3c0b44e7-7bc3-49e4-bb81-80c5b5eb9a7e

Research Results: Brands whose ads include references to “charge-allIntegralQualitySync” or “charge-allIntegralSuspiciousActivity” and had their ads served to bots

Integral Ad Science offers advertisers various pre-bid targeting segments that inform the bidding activity of their programmatic auction bidders and demand side platforms (DSPs).

According to IAS’ public documentation, IAS offers a pre-bid segment to prevent ad fraud or “suspicious activity”. These segments can help “remove fraudulent bot traffic from your campaign.”

Screenshot of IAS technical documentation, describing IAS “Suspicious Activity” Pre-Bid Segments; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q
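To make the mechanics concrete, below is a hypothetical sketch of how a pre-bid segment of this kind could, in principle, gate a DSP’s bidding decision. This is purely illustrative: the classify_traffic and should_bid functions, the bot user agent, and the IP prefix are all assumptions for illustration, not IAS’ actual (proprietary) implementation.

```python
def classify_traffic(ip: str, user_agent: str) -> bool:
    """Stand-in for a vendor's pre-bid classification: True means 'suspicious'."""
    known_bot_agents = {"ExampleBot/1.0"}   # placeholder entries from a bots list
    known_dc_prefixes = ("185.172.52.",)    # placeholder data center IP ranges
    return user_agent in known_bot_agents or ip.startswith(known_dc_prefixes)

def should_bid(bid_request: dict) -> bool:
    """Decline to bid when the pre-bid segment flags the impression as suspicious."""
    device = bid_request.get("device", {})
    return not classify_traffic(device.get("ip", ""), device.get("ua", ""))

request = {"device": {"ip": "185.172.52.56", "ua": "ExampleBot/1.0"}}
print(should_bid(request))  # False -- a working pre-bid filter would skip this impression
```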

According to IAS’ public documentation, IAS’ Ad Fraud Solution utilizes multiple detection methodologies for maximum protection. This three-pillar approach is marketed as being “powered by unmatched scale and machine learning, providing the most accurate detection & prevention.” The IAS Ad Fraud Solution claims to use “rules-based detection [...] to identify any anomalous behavior patterns”, “AI/Machine Learning [...] using big data to detect any hidden, uncommon patterns”, and “malware analysis & reverse engineering to uncover any emerging threats.”

Screenshot of IAS technical documentation, describing IAS Ad Fraud Solution; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

According to IAS’ public documentation, IAS offers a “Quality Sync” Pre-bid segment which “automatically matches your pre- and post-bid settings” including “Ad Fraud (IVT)” settings.

Screenshot of IAS technical documentation, describing IAS Quality Sync Pre-Bid Segments; Source: https://go.integralads.com/rs/469-VBI-606/images/IAS_Xandr_User_Guide.pdf; Archived: https://perma.cc/93NM-2S2Q

It appears that hundreds of major brands whose ads’ source code includes references to “charge-allIntegralQualitySync” or “charge-allIntegralSuspiciousActivity” had their ads served to bots in data centers between 2017 and 2024. In some cases, brands whose ads included these references were observed having their ads served to declared bots operating out of known data center IP addresses, whose user agent has been on the IAB Tech Lab’s Bots and Spiders reference list since 2013.

For example, the Government of DC (the US capital) had ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the DC Government’s ads’ source code included references to “charge-allIntegralSuspiciousActivity”. Specifically, the DC Government had its ads served to a declared bot operating out of a known data center IP in November 2023 on the websites jaogata.exblog.jp, maitresseariane.eklablog.com, and worthingcourtblog.com. The DC Government ads were transacted by the Trade Desk Open Path, Index Exchange, and Sovrn (f/k/a Federated Media).

As a second example, the British healthcare company Haleon had ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the Haleon ads’ source code included references to “charge-allIntegralSuspiciousActivity”. For example, Haleon had its advil.com ads served to a declared bot operating out of a known data center IP in November 2023 on the websites themakerista.com, strengthandsunshine.com, and worthingcourtblog.com. The Haleon Advil ads were transacted by Kargo and Mediagrid.

As a third example, Hershey’s had ads served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the Hershey's ads’ source code included references to “charge-allIntegralQualitySync”. For example, Hershey’s had its ads served to a declared bot operating out of a known data center IP in March 2024 on the websites bordercolliehealth.com, isavea2z.com, and amyromeu.com. The Hershey's ads that were served to declared bots in data centers were transacted by Trade Desk Open Path, Sharethrough, Index Exchange, GumGum, Yieldmo, Kargo, TrustX, Microsoft Xandr, OpenX, and Teads.

As a fourth example, Kenvue (f/k/a Johnson & Johnson Consumer Health) had ads for Listerine, Tylenol, and Imodium served to declared bots on the IAB Bots List operating out of known data center IPs, wherein the source code for Kenvue’s ads included references to “charge-allIntegralQualitySync”. For example, Kenvue had its ads served to a declared bot operating out of a known data center IP in July 2024 on the websites pineapplepaperco.com, thegonegoat.com, and lakelandmom.com. The Kenvue ads that were served to declared bots in data centers were transacted by Index Exchange, Sharethrough, and Reset Digital.

The list of brands whose ads’ source code includes references to “charge-allIntegralSuspiciousActivity” or “charge-allIntegralQualitySync”, and had their ads served to a declared bot on the IAB Bots List operating out of a known data center IP address, included many major brands such as Hershey’s, Coca-Cola, Walmart, Starbucks, Kimberly-Clark, Liberty Mutual, Nestle, Samsung, Haleon, Cox Communications, State Farm, the Government of DC, UPS, Enterprise car rental, Kenvue, Energizer, CVS, JPMorgan Chase, GSK, Patagonia, Mars, Walgreens, Audi USA, Weight Watchers, USPS, Cigna, Thomson Reuters, and Discover.

A sample list of the brands whose ads were served to declared bots on the IAB Tech Lab’s Bots & Spiders list operating out of known data center IP addresses, and whose ads included references to “charge-allIntegralSuspiciousActivity” or “charge-allIntegralQualitySync”, is shown below.

Separately, many brands whose ads’ source code includes references to “charge-allIntegralSuspiciousActivity” or “charge-allIntegralQualitySync” appear to have had their ads served to URLScan.io’s bots.

As an example, in the screenshot below one can see two ads for Starbucks that were served to URLScan.io’s bot whilst the bot was crawling blackthen.com from a data center on February 17th, 2023. The Starbucks ads - which appear to have been transacted by Publicis media agency, Trade Desk, and Yieldmo SSP - include references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Two Starbucks ads transacted by Publicis, Trade Desk, and Yieldmo to a bot in a data center. The source code of the Starbucks ads includes references to “charge-allIntegralSuspiciousActivity” and code from IAS. Source: https://urlscan.io/result/0bd02107-d66a-4675-a45e-62039ae34667/

As another example, in the screenshot below one can see an ad for JPMorgan Chase that was served to URLScan.io’s bot whilst the bot was crawling msn.com from a data center on October 31st, 2023. The JPMorgan Chase ad appears to include references to “charge-allIntegralSuspiciousActivity” and code from IAS.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled msn.com from a data center on October 31st, 2023. JPMorgan private bank ad served to a bot 🤖 with no user IDs in a data center. The ad was served by Trade Desk and Microsoft Xandr (via PMP deal ID “1691758”), both of whom claim to partner with HUMAN Security. The source code of the JPMorgan bank ad that was served to a bot 🤖 in a data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Source: https://urlscan.io/result/8444dbdd-0ea0-477c-9c30-b6cc664e4402/

As another example, in the screenshot below one can see an ad for the New York State Energy Research and Development Authority (NYSERDA), a New York state government agency, that was served to URLScan.io’s bot whilst the bot was crawling abandonedamerica.us from a data center on September 10th, 2022. The New York state government ad appears to include references to “charge-allIntegralSuspiciousActivity” and code from IAS.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled abandonedamerica.us from a data center on September 10th, 2022. Ad served by Amazon to a bot 🤖. New York State Energy Research and Development Authority (NYSERDA) ad served to a bot with no user IDs in a data center. The ad was served by Trade Desk and OpenX SSP, both of whom claim to partner with HUMAN Security. The source code of the New York state government ad that was served to a bot in a data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Source: https://urlscan.io/result/ed6aa5e6-8588-434e-9e36-348c23a94705/

As another example, an ad for Coca-Cola was served to URLScan.io’s bot whilst the bot was crawling yume2kki.fandom.com from a data center on November 7th, 2023. The Coca-Cola ad appears to include references to “charge-allIntegralSuspiciousActivity” and code from IAS.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled yume2kki.fandom.com from a data center in Frankfurt, Germany 🇩🇪 on November 7th, 2023. Novo Nordisk Ozempic pharmaceutical ad that may have been intended for a consumer in USA 🇺🇸 that was served to a bot with no user IDs in a data center in Germany 🇩🇪. The ad was served by Trade Desk and Pubmatic SSP (via PMP deal ID “PM-CXIX-9328”), both of whom claim to partner with HUMAN Security. The ad tech vendors may have inferred the bot's geolocation as being in Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database. The source code of the Novo Nordisk Ozempic ad that was served to a bot in a German data center includes references to “charge-allDoubleVerifyBotAvoidance” and code from DoubleVerify. Coca-Cola ad that may have been intended for a consumer in USA 🇺🇸 that was served to a bot with no user IDs in a data center in Germany 🇩🇪. The ad was served by Trade Desk and Magnite SSP, both of whom claim to partner with HUMAN Security. The ad tech vendors may have inferred the bot's geolocation as being in Delaware 🇺🇸 - possibly due to an outdated ip-to-geolocation lookup database. The source code of the Coca-Cola ad that was served to a bot in a German data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Source: https://urlscan.io/result/3c636fea-fef8-49d2-9220-287d95ac0625/

As another example, an ad for Visa was served to URLScan.io’s bot whilst the bot was crawling msn.com from a data center on March 7th, 2023. The Visa ad appears to include references to Publicis Apex, and to “charge-allIntegralSuspiciousActivity” and code from IAS.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled msn.com from a data center in Frankfurt, Germany 🇩🇪 on March 7th, 2023. Visa ad - which includes references to Publicis and Apex - served to a bot with no user IDs in a data center. The ad was served by Trade Desk and Triplelift SSP, both of whom claim to partner with HUMAN Security. The source code of the Visa ad that includes references to Publicis and Apex and was served to a bot in a German data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS. The Visa ad source code includes references to “utm_term=mobile-retargeting”.

Source: https://urlscan.io/result/2198a65c-4140-4d53-bc66-71f28895d30f/#summary

As another example, an ad for the United States Postal Service (USPS) was served to URLScan.io’s bot whilst the bot was crawling legit.ng from a data center on May 22nd, 2023. The USPS ad appears to include references to “charge-allIntegralSuspiciousActivity” and code from IAS.

URLScan.io’s bot 🤖 with no pre-existing user IDs crawled legit.ng from a data center on May 22nd, 2023. Ukrainian government ad (UNITED24 - The initiative of the President of Ukraine or u24.gov.ua) being served to a bot 🤖 in a data center by Google. Two United States Postal Service (USPS) ads served to a bot 🤖 with no user IDs in a data center. The ads were served by Trade Desk, Google AdX, and Sharethrough. Trade Desk and Sharethrough claim to partner with HUMAN Security. Only one of the two USPS ads is visible above-the-fold in the bot’s screenshot. The source code of the USPS ad that was served to a bot 🤖 in a data center includes references to “charge-allIntegralSuspiciousActivity” and code from IAS.

Source: https://urlscan.io/result/a2f1888e-3af0-4d69-863e-0e26abe75ef1/

As another example, a muted, auto-playing, out-stream video ad for Coca-Cola was served to URLScan.io’s bot whilst the bot was crawling romewise.com from a data center on February 6th, 2024. The Coca-Cola ad appears to include references to “charge-allIntegralQualitySync”, “charge-allIntegralVideoViewability”, and “charge-allQAVideoCompletionRate”, and code from IAS.

A Coca-Cola video ad that was served to URLScan.io’s bot. The source code of the Coca-Cola ad appears to contain references to “charge-allIntegralQualitySync”. Source: https://urlscan.io/result/a220e58c-3aa6-480a-b551-1376724c59e0/#summary

As another example, a muted, auto-playing, out-stream video ad for Hershey's Rolo candy was served to URLScan.io’s bot whilst the bot was crawling chefjulieyoon.com from a data center in Australia on September 18th, 2023. The Hershey’s ad appears to include references to “charge-allIntegralQualitySync” and code from IAS.

A Hershey’s Rolo candy video ad that was served to URLScan.io’s bot. The source code of the Hershey’s ad appears to contain references to “charge-allIntegralQualitySync”. Source: https://urlscan.io/result/84886d76-1699-4d22-a0f9-e4a7d6f53bcf/#summary

As another example, two ads for Microsoft Copilot AI were served to URLScan.io’s bot whilst the bot was crawling economist.com from a data center on December 14th, 2024. The Microsoft ads appear to include references to “charge-allIntegralQualitySync” and code from IAS.

Screenshot of two Microsoft ads served to URLScan.io’s bot. The source code of the Microsoft ads includes references to “charge-allIntegralQualitySync” and code from IAS.

Source: https://urlscan.io/result/f10fb0fa-ea19-4f03-b060-0e773dfebd18/#summary

A sample list of the brands whose ads were served to URLScan.io’s bots and whose ads included references to “charge-allIntegralSuspiciousActivity” or “charge-allIntegralQualitySync” is shown below.

Research Results: YouTube TrueView ads served to bots on Google Video Partner sites

According to Google’s public online documentation, “Google video partners are high-quality publisher websites and mobile apps where you can show your video ads to viewers beyond YouTube.” “Video partner publishers are carefully vetted and must meet Google's inventory quality standards. Video partner publishers also need to follow the policies applicable to their ad management platform (which can include Google Ad Manager, AdMob, or AdSense). Our Video Ad Safety Promise, in which certain types of content can't be monetized for ads, applies to both YouTube and Google video partners.”

TrueView was Google's “proprietary cost-per-view, choice-based ad format that serves on YouTube, millions of apps, and across the web.” With TrueView, advertisers only pay “for actual views of their ads, rather than impressions.” TrueView asks users if they want to skip the video ad after 5 seconds with a visual prompt. Google’s policies stated that TrueView ads must be skippable and audible, and that playback of the video (and ad) cannot be solely initiated by passive user scrolling.

According to another piece of Google’s documentation on “Outstream video ad[s]”, outstream video ads “serve on partner sites and apps outside of YouTube”, and “the ads start with the sound off and the user can tap the ad to unmute it.”

One can observe many instances where TrueView video ads from various advertisers were served to bots rather than humans, including in environments where those ads were served entirely muted (with sound off) and without the user clicking on the video to initiate media playback.
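Observations like these can be checked programmatically by loading a page in a headless browser and inspecting the state of its video element. The sketch below uses Playwright (assumed installed via `pip install playwright`); the URL is a placeholder, and a real crawl would also need to locate the specific ad player on the page.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/article-with-outstream-ad")  # placeholder URL
    # Inspect the first <video> element: a muted video that is already
    # playing indicates playback that was neither user-initiated nor audible.
    state = page.eval_on_selector(
        "video",
        "v => ({muted: v.muted, playing: !v.paused, autoplayAttr: v.autoplay})",
    )
    print(state)
    browser.close()
```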

For example, on January 2nd, 2024, one can see that a TrueView video ad for the Timberlyne Group was served to a bot operating out of a data center in a muted, auto-playing video player. The video ad was served to a bot that was crawling wsj.com. The video ad is fully muted and autoplays without user initiation when served to the bot.

The bot was operating out of IP address 185.172.52.56 when it was served the Timberlyne video ad. This IP address is owned by PacketHub S.A., is hosted in St. Louis, US, and belongs to data center infrastructure.

Screenshot of a Timberlyne Group YouTube TrueView video ad served to a bot in a datacenter on a muted, auto-playing video player on January 2nd, 2024. Source: https://urlscan.io/result/b67f04a5-6f45-42c5-b63c-c4160b4a53ec/
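The IP attribution above can be sketched with Python’s standard library alone. The CIDR block below is a placeholder standing in for an industry reference list of known data center ranges; a real check would load the full list.

```python
import ipaddress

# Placeholder CIDR block; a real check would load an industry reference
# list of data center ranges.
DATACENTER_RANGES = [ipaddress.ip_network("185.172.52.0/24")]

def is_datacenter_ip(ip: str) -> bool:
    """Return True when the IP falls inside a known data center range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DATACENTER_RANGES)

print(is_datacenter_ip("185.172.52.56"))  # True under the placeholder list
```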

As another example, on October 25th, 2022, one can see that a TrueView video ad for the Mike Lee for Senate campaign was served to a bot operating out of a data center in a muted, auto-playing video player. The video ad was served to a bot that was crawling theflatbkny.com. The video ad is fully muted and autoplays without user initiation when served to the bot.

Screenshot of a Sen. Mike Lee for Senate YouTube TrueView video ad served to a bot in a datacenter on a muted, auto-playing video player on October 25th, 2022. Source: https://urlscan.io/result/7f159e54-302f-4da6-9f67-e9313eef2fe4/

Research Results: Google’s publisher ad server appears to have served millions of impressions from thousands of advertisers to bots, including declared bots operating out of Google’s own data centers

Many of the websites observed serving ads to bots for years appear to be utilizing Google Ad Manager (GAM) as their publisher ad server. This can be deduced by the presence of specific HTTPS requests and by the presence of Google Publisher Tag on their pages. The Google Publisher Tag (GPT) is an ad tagging library for Google Ad Manager. A future report will analyze the prevalence of ads being served by Google Ad Manager to declared bots in known data centers.
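For instance, one could infer GAM usage from a page’s static HTML by checking for GPT’s characteristic signals: the gpt.js library URL and the googletag API. The sketch below uses the requests library and a placeholder URL; pages that inject GPT dynamically would require a headless browser instead.

```python
import requests

def uses_gpt(url: str) -> bool:
    """Heuristically detect the Google Publisher Tag in a page's static HTML."""
    html = requests.get(url, timeout=10).text
    signals = (
        "securepubads.g.doubleclick.net/tag/js/gpt.js",  # the GPT library URL
        "googletag.pubads",                              # a GPT API call in page JS
    )
    return any(signal in html for signal in signals)

print(uses_gpt("https://example.com"))  # placeholder URL
```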

According to Check My Ads Institute (CMAI), which observed the United States vs. Google LLC trial proceedings in September 2024, Per Bjorke (Director of Product Management, AdSpam Team at Google) commented under oath about Google’s bot and invalid traffic filtration systems.

According to Check My Ads’ recounting:

“[Bjorke] talks about all Google does to combat Invalid Traffic and ad fraud, and Google’s ~$250M investment in the area in ‘22. He explains how IVT matters to advertisers, as they wouldn’t want to pay for traffic from bots sitting in a data center somewhere. He speaks to the publisher vetting process, and how keeping bad actors out of the AdX system is important, so Google doesn’t need to battle them later on. We hear about how they have a refund and claw-back process to make sure advertisers don’t pay for and publishers aren’t paid for IVT. Google points to AWbid to try to show how fraud and spam was more prevalent on third party exchanges, and Bjorke explains why it is much easier to contain in a closed system.”

Research Results: Hundreds of thousands of United States government healthcare.gov ads were served to bots operating out of Google’s own data centers

Hundreds of thousands of healthcare.gov ads (Department of Health and Human Services) appear to have been served for years to declared bots operating out of Google Cloud data center IP addresses. Specifically, healthcare.gov ads were served to the HTTP Archive bot from 2022 to 2024. The bot is a declared bot, whose user agent is on the IAB Bots and Spiders List.

For example, the following healthcare.gov ad creative was served at least 41,141 times to the declared HTTP Archive bot:

Source: https://s0.2mdn.net/simgad/9542029637209407237

A similar healthcare.gov ad was served 33,107 times to HTTP Archive’s bot between 2022 and 2024.

Source: https://s0.2mdn.net/simgad/8208861446627735063

The following healthcare.gov ad was served 28,702 times to HTTP Archive’s declared bot operating out of a Google Cloud data center.

Source: https://s0.2mdn.net/simgad/1008250947654517080

The following healthcare.gov ad was served 23,479 times to HTTP Archive’s declared bot operating out of a Google Cloud data center.

Source: https://s0.2mdn.net/simgad/15936844823979823746

The following healthcare.gov ad was served 12,550 times to HTTP Archive’s declared bot operating out of a Google Cloud data center IP address.

Source: https://s0.2mdn.net/simgad/8871931525062578142
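Counts like those above can in principle be reproduced from the public HTTP Archive dataset on Google BigQuery. The sketch below is a hedged example: it assumes the httparchive.all.requests table and its url column (table and column names have changed across HTTP Archive schema versions), it requires configured Google Cloud credentials and billing, and a full scan of this table can be costly.

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes GCP credentials and billing are configured
sql = """
    SELECT COUNT(*) AS impressions
    FROM `httparchive.all.requests`
    WHERE url LIKE '%s0.2mdn.net/simgad/9542029637209407237%'
"""
rows = client.query(sql).result()
print(next(iter(rows)).impressions)
```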

Research Results: Publishers who were seen consistently NOT serving ads to HTTP Archive’s declared bot

As described in the methodology section towards the beginning of this research report, this study sourced data from several distinct bot and crawler datasets. One of those datasets was the HTTP Archive, which utilizes a declared bot operating out of Google Cloud data center IP addresses to crawl several million websites each month. The HTTP Archive’s bot is a declared bot because it specifically communicates its status as a bot via the User-Agent header.

Screenshot from HTTP Archive’s public documentation page, showing the exact declared bot user agent used by the bot. Source: https://almanac.httparchive.org/en/2022/methodology

The User-Agent header is an HTTP header intended to identify the user agent responsible for making a given HTTP request. When a person browses the web via a browser on their phone rather than on the desktop, their phone’s browser declares via the HTTP header that the user is browsing the web on a mobile browser (rather than a desktop browser).

Similarly, when various “good” bots, such as GoogleBot or BingBot, crawl the open internet, those bots openly declare themselves via the HTTP User-Agent header. If a website does not want to appear in Google or Bing search results, it can choose to block GoogleBot or BingBot via its web hosting provider.
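This opt-out mechanism is conventionally implemented via robots.txt, which well-behaved declared bots honor. The sketch below evaluates such rules with Python’s standard-library robots.txt parser, against a placeholder URL.

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")  # placeholder URL
rp.read()

# Given a robots.txt containing:
#   User-agent: Googlebot
#   Disallow: /
# the following would return False, and a well-behaved Googlebot would
# skip the page entirely.
print(rp.can_fetch("Googlebot", "https://example.com/some-page"))
```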

The HTTP Archive bot’s user agent has been on the IAB Tech Lab’s Bots and Spiders reference list since 2013.

According to the IAB Tech Lab, “The IAB Tech Lab publishes a comprehensive list of such Spiders and Robots that helps companies identify automated traffic such as search engine crawlers, monitoring tools, and other non-human traffic that they don’t want included in their analytics and billable counts [...] The IAB Tech Lab Spiders and Robots provides the industry two main purposes. First, the spiders and robots list consists of two text files: one for valid browsers or user agents and one for known robots. These lists are intended to be used together to comply with the “dual pass” approach to filtering as defined in the IAB’s Ad Impression Measurement Guidelines (i.e., identify valid transactions using the valid browser list and then filter/remove invalid transactions using the known robots list). Second, the spiders and robots list supports the MRC’s General Invalid Traffic Detection and Filtration Standard by providing a common industry resource and list for facilitating IVT detection and filtration.”
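A minimal sketch of the “dual pass” approach described above, with placeholder patterns standing in for the licensed IAB Tech Lab lists: a transaction is counted only if its user agent matches the valid-browser list and does not match the known-robots list.

```python
# Placeholder patterns standing in for the licensed IAB Tech Lab lists.
VALID_BROWSERS = ["mozilla", "chrome", "safari"]  # pass 1: valid user agents
KNOWN_ROBOTS = ["bot", "spider", "crawler"]       # pass 2: known robots

def is_countable(user_agent: str) -> bool:
    """Dual pass: keep valid-browser matches, then remove known-robot matches."""
    ua = user_agent.lower()
    is_valid = any(pattern in ua for pattern in VALID_BROWSERS)
    is_robot = any(pattern in ua for pattern in KNOWN_ROBOTS)
    return is_valid and not is_robot

print(is_countable("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"))  # True
print(is_countable("Mozilla/5.0 (compatible; ExampleBot/1.0)"))    # False
```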

According to IAB Europe, “Traffic consisting of robotic user agents is non-human rather than fraudulent; the technology exists to be helpful to internet users and has not been ‘disguised’ as a human user. This means that they can be easily detected and filtered by implementing the Spiders & Robots list” (emphasis added).

Source: https://iabeurope.eu/international-iababc-spiders-and-bots-list/

According to the Audit Bureau of Circulations (ABC UK), “The IAB/ABC International Spiders & Bots list helps filter known, non-human activity that can significantly inflate ad impression and site traffic counts. Effective use of this data is a requirement of various standards globally, and can lead to a more transparent and accurate measurement for you or your clients.”

Source: https://www.abc.org.uk/assurance/bots-and-spiders

As previously documented, this study observed many instances where ad tech vendors appeared to serve ads to declared bots whose user agent is on the IAB Tech Lab’s Bots list.

Similarly, many different digital publishers and websites were observed serving digital ads to declared bots on IAB Tech Lab’s bots list. This includes many publishers who appear to utilize IAS and DoubleVerify’s pre-bid publisher optimization tools on their pages, such as washingtonpost.com, usatoday.com, weather.com, cnn.com, fandom.com, and wired.com.

For example, nytimes.com was observed serving ads for Jaeger-LeCoultre whilst the declared HTTP Archive bot was crawling the website in May 2022.

Weather.com - which currently appears to use DoubleVerify's publisher optimization tool and previously appeared to use IAS's publisher optimization tool - was observed serving ads for Samsung, Haleon Flonase, Togo's restaurants, Brighthouse Financial, Unilever Murad, American Airlines, Procter & Gamble Bounty Towels, DC Water, and Progressive insurance whilst the HTTP Archive bot was crawling the website in 2022, 2023, and/or 2024.

Usatoday.com - which appears to use IAS' publisher optimization tool on its webpages - was observed serving ads for JPMorgan Chase, T-Mobile, the Government of Utah, and Bank of America to the HTTP Archive bot whilst the bot was crawling the website in 2022, 2023, and/or 2024.

Warner Bros. Discovery operated CNN.com - which appears to use IAS' publisher optimization tool on its webpages - was observed serving ads for Jaeger-LeCoultre and Vista Paint to the HTTP Archive bot whilst the bot was crawling the website in 2022.

Conde Nast operated Wired.com - which appears to use IAS' publisher optimization tool on its webpages - was observed serving ads for T-Mobile to the HTTP Archive bot whilst the bot was crawling the website in February 2022.

Fandom.com - which appears to use both IAS' and DoubleVerify’s publisher optimization tools on its webpages - was observed serving ads for T-Mobile, US Bank, Progressive, Apple, and the Virginia College Savings Plan (Virginia529) to the HTTP Archive bot whilst the bot was crawling the website in 2022-2024.

However, analyzing years of HTTP Archive data reveals a notable “absence” of ad delivery to the declared HTTP Archive bot on a few websites. For example, a few websites have been regularly crawled by the HTTP Archive bot, yet there appear to be no records of ads being served to that bot on those specific pages.

One can use a standard Chrome browser with Chrome Developer Tools’ Network conditions setting, or a custom browser extension, to modify the user agent declared by the browser. For example, if a consumer is using Chrome on Windows, one can use a browser extension to make it appear as though the device is GoogleBot, BingBot, or another type of crawler. With this capability, one can emulate the exact user agent used by HTTP Archive’s bot when it is crawling the web.

Emulating the HTTP Archive declared bot user agent via Chrome developer tools shows that some websites consistently and completely block any ads from being served to a user declaring themselves to be a bot.
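The same experiment can be approximated outside the browser. The sketch below requests a page while declaring a bot user agent and checks the returned HTML for well-known ad-serving hostnames. The BOT_UA string is a placeholder (HTTP Archive publishes its exact declared user agent on its methodology page), and since many ad calls are made by JavaScript after page load, a headless browser would give a fuller picture than static HTML.

```python
import requests

# Placeholder -- substitute the exact declared user agent published on
# HTTP Archive's methodology page.
BOT_UA = "<exact HTTP Archive bot user agent string>"

def ads_in_static_html(url: str) -> bool:
    """Fetch a page while declaring a bot UA and look for ad-serving hostnames."""
    html = requests.get(url, headers={"User-Agent": BOT_UA}, timeout=10).text
    ad_hosts = ("doubleclick.net", "adsafeprotected.com", "amazon-adsystem.com")
    return any(host in html for host in ad_hosts)

print(ads_in_static_html("https://example.com"))  # placeholder URL
```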

For example, in the screenshot below, one can see that if a user declares themselves to be the user agent that is used by the HTTP Archive bot, Reuters’ website blocks the user and does not show any ads whatsoever.

Reuters.com blocks the user if the user is a declared bot.

Similarly, the website of the Wall Street Journal wsj.com also blocks the user if the user declares themself to be a bot via the HTTP User-Agent request header.

The Wall Street Journal’s website wsj.com blocks the user if the user is a declared bot.

Many websites owned or operated by a media company called the Arena Group also appear to block users who declare themselves to be a bot via the HTTP User Agent request header. For example, in the screenshots below, one can see how the websites parade.com, autoblog.com, pethelpful.com, athlonsports.com, thestreet.com, mensjournal.com, and surfer.com all appear to block the user and do not serve any ads if the user is a declared bot like the HTTP Archive bot.

Arena Group managed parade.com blocks the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

Arena Group managed autoblog.com blocks the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

Arena Group managed pethelpful.com blocks the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

Arena Group owned athlonsports.com blocks the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

Arena Group managed thestreet.com blocks the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

Bloomberg.com also appears to create a challenge that the user must solve if they declare themselves to be a bot, and does not appear to serve ads initially to the user.

Bloomberg.com blocks and challenges the user and does not serve any ads if the user declares themself to be a bot via the HTTP User Agent request header.

One can contrast this difference in behavior by looking at what happens when a user declares themself to be a bot via the HTTP User-Agent header and visits websites such as washingtonpost.com, nytimes.com, weather.com, si.com, cafedelites.com, usatoday.com, or fandom.com. As mentioned before, many of these websites were observed serving ads to declared bots such as the HTTP Archive bot.

For example, in the screenshot below, one can see that washingtonpost.com serves a Google Gemini AI ad to the user when the user’s declared User Agent is a bot on the IAB Tech Lab’s Bots & Spiders list.

washingtonpost.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header.

nytimes.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header.

weather.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header.

Minute Media operated si.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header.

fandom.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header. Fandom uses IAS and DoubleVerify publisher optimization tools on its pages.

Gannett Media operated usatoday.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header.

cafedelites.com appears to serve ads even if the user declares themself to be a bot via the HTTP User Agent request header. A P&G Vicks video ad can be seen in the screenshot.

Prompted by these observations, Adalytics reached out to several of the media publishers who appear to be filtering out declared bot traffic. Adalytics asked them for comments about how and why they are filtering out declared bot traffic, or preventing ads from serving to declared bots such as the HTTP Archive bot.

Several of the media publishers declined to respond or comment. However, one publisher - the Arena Group - agreed to provide comments “on the record” that could be cited in this report.

The questions and answers can be seen below. The bolded text at the top represents questions that Adalytics asked, and the non-bolded text below contains the publisher’s responses.

  1. What is Arena Group’s position on serving ads to bots?

    • “We believe it’s inappropriate to serve ads to bots. Therefore, we employ a significant amount of resources to ensure our traffic is high-quality. Advertisers chose to work with Arena because of our reach, influence and performance. Serving ads to bots is a betrayal of that trust and would only degrade our performance and relationships.”

  2. Do publishers have any responsibility in ensuring ads do not serve to bots, or should the responsibility lie with SSPs, DSPs, media agencies or brands’ verification vendors?

    • “Dealing with bots and other types of fraud is the responsibility of every company in the advertising supply chain. No single solution is perfect, but, for us, it’s very important to ensure our traffic is authentic. Therefore, we take additional steps to combat bot traffic and don’t rely solely on one partner.”

  3. What is the “opportunity cost” for a publisher of forgoing serving ads to declared bots and data center-originating web traffic?

    • “In the long term, there is no real cost. While it may seem appealing for a publisher to show ads to bots and, therefore, drive traffic and revenue in the short term, it actually is a bad long-term decision. It creates noise or bad data, thereby potentially guiding us in the wrong direction. It also reduces trust with our advertising partners. When we build strategies based on our real traffic and revenue trends, we know we’re making better decisions and providing real value for our partners for the long term.”

  4. For a publisher, what are the trade-offs when considering whether or not to allow ads to serve to bots on the publishers’ properties?

    • “As mentioned, allowing ads to be served to bot traffic could enable additional short-term revenue, but not without significant downsides, not the least of which is creating a lot of noise in our data about consumer and revenue trends, but also decreased value for our advertisers. If advertisers can’t trust us, they won’t work with us.”

  5. How much bot traffic and data center-originated traffic does a publisher see on its properties? For example, do 5%, 10%, 15% of page visits come from bots?

    • “25-30% range and it can be as high as 40+%”

Conclusion

Caveats & Limitations

Interpreting the results of this observational study requires nuance and caution.

The digital media supply chain has numerous stakeholders involved in transacting an ad. When a digital ad is delivered on a website, it can involve a publisher, one or more supply side platforms or ad exchanges, one or more ad servers, one or more verification vendors, a demand side platform, and/or a media agency. Thus, it can be difficult to determine or infer who was “responsible” or made a “decision” that resulted in an ad being served to a bot. Even if a vendor has correctly identified a user as a bot, sometimes other stakeholders may intentionally or inadvertently still authorize an ad to be served to that bot.

For example, a vendor may correctly identify a bot, and pass that information to another vendor who controls the ad serving or bidding decisioning process. That second vendor - for a number of reasons - may disregard, ignore, or fail to action the accurate and correct bot classification returned by the first vendor.

One should not assume that because a given ad tech vendor or vendors transacted a given ad to a bot that those vendors are somehow responsible or “at fault” for the ad being served to a bot. It is possible that a different stakeholder in the media supply chain ignored or accidentally ‘over-rode’ the recommendations or data points provided by other ad tech vendors, or that limited information was passed between vendors in the media supply chain.

The Results section of this report is based entirely on publicly available data obtained on the open web through open source intelligence (OSINT) techniques. None of the data, observations or analysis included in the Results section of this report came from privileged, non-public, or proprietary sources.

Readers should be discerning and careful not to conflate distinct sets of observations or draw inferences about causality, intent, quantitative impact, magnitude, or provenance.

For the avoidance of doubt, this report makes no assertions about any of the above. It makes no assertions about the intent behind certain media buying practices, or about whether those practices were undertaken with specific parties’ authorization. It likewise makes no claims with regards to the quantitative magnitude, causality, provenance, or relative abundance of the phenomena observed in this study.

This study does not make any recommendations to media buyers with regards to whether or not to transact with specific ad vendors or with specific publishers. The study is intended solely to present a set of public observations, so that readers can come to their own conclusions and make their own informed decisions.

Furthermore, this report is based on publicly accessible, client-side forensics/strings. Those forensics can yield false positives, for example, in situations where an “adomain” or ad clickthrough URL destination is not indicative of who paid for a given ad placement.

Caveats & Limitations - Questions & Answers (Q&A)

  1. Does this study make any assertions with regards to quantitative impact?

No. This study does not make any assertions with regards to quantitative impact. For example, this study does not quantify how many total ads were served via specific vendors to bots, or the absolute prevalence of bot traffic in digital advertising.

  2. Does this study recommend that media buyers exclude or block specific publishers or intermediaries from future ad buys?

No. This study does not recommend or advise any media buyers to exclude or block specific vendors, intermediaries, or sellers from future ad buys. Each media buyer may wish to review their own ad buy records, their media buying requirements, and make their own informed decisions with regards to whether or not to continue to transact with any specific media vendor or intermediary.

  3. Does this study assert that any specific verification vendors, media agencies, DSPs, SSPs, vendors, intermediaries, suppliers, or sellers are “at fault” for specific potentially undesired ad transactions?

No. This observational research study makes no assertions with regards to fault, intent, or causality. In many cases it is not possible to fully attribute why a given ad was served, and there may be multiple different stakeholders involved. For example, many programmatic ads are transacted by a publisher that operates a website, a supply side platform, a demand side platform, a media agency, a publisher ad server, an advertiser ad server, and a verification vendor. One or more of these entities may have a role to play in why an ad was served at a given time.

Moreover, some of the vendors mentioned in this post provide limited information about how their solutions function, and no information about how their solutions are supposed to work in a transaction where the various parties are each using some form of “filtration” from two different vendors.

Discussion

In 2018, two US Senators wrote a letter to the Federal Trade Commission (FTC) “to express frustration with the growing phenomenon of digital ad fraud”, asking the FTC to investigate:

the extent to which major ecosystem stakeholders engage in willful blindness to fraudulent activity in the online ad market.

The Financial Times previously cited Mikko Kotila, a computer scientist and co-author of a landmark World Federation of Advertisers (WFA) report on ad fraud, as saying:

“Mr Kotila says middlemen in the digital media supply chain often turn a blind eye to fake traffic and other skulduggery because doing so is in their financial interest. “They are either ignorant of the fraud or they are an accomplice to it,” he says.”

The Financial Times previously cited Shailin Dhar, an ad tech industry expert, saying:

“Mr Dhar remains sceptical about how far the industry will get in its fraud-fighting efforts, given that many constituents have a financial incentive to maintain the status quo. “Ad tech companies have made billions of dollars a year from fraudulent traffic,” he says. “Fraud is built into the foundation of advertising supply.””

Shailin Dhar previously commented in a LinkedIn post:

“Advertisers, Why do we spend our efforts chasing “super sophisticated botnets” operated by the worlds “most devious cybercriminals”, when we haven’t stopped basic data-center/server-farms from eating up ad budgets?”

Source: https://www.linkedin.com/pulse/marketers-stop-distracting-yourself-focus-ad-fraud/

The Association of National Advertisers (ANA) previously stated that:

A false sense of security enables fraud to thrive.

This research study documents how thousands of brands - including the United States federal government and military - had their ads served to bots. Some of the bots are openly declared bots whose user agent has been on the advertising industry’s reference bot list since 2013, and were operating out of well known data center IP addresses.

Screenshot of a quote from P&G Chief Brand Officer Marc Pritchard on the TAG website

Many of the ad tech vendors which appear to serve ads to bots claim to partner with vendors such as HUMAN Security. Some of these vendors made public claims that they are validating 100% of their prospective ad inventory with HUMAN and ensuring ads are not served to bots. However, many of those ad tech vendors - Trade Desk, Microsoft Xandr, Index Exchange, Google, Sovrn, Yieldmo, Sharethrough, Kargo, GumGum, Triplelift, Magnite, Pubmatic, Sonobi, Freewheel, media.net, Beachfront, Primis, and Omnicom Media Group - were serving ads to bots.

Several of these vendors have made public statements about validating 100% of bid request pre-bid, and avoiding bidding on or serving ads to bots. However, this research shows that even vendors who made such public press releases were seen in some cases serving ads to bots.

The Trade Desk had its own product and job recruiting ads served to bots in some cases.

Furthermore, many advertisers and media agencies appear to be utilizing the services of ad verification vendors Integral Ad Science and DoubleVerify, yet having some of their ads served to bots. The source code of many brands’ ads included references to “charge-allDoubleVerifyBotAvoidance” and “charge-allIntegralSuspiciousActivity”. For example, the US Navy’s ads included references to “charge-allDoubleVerifyBotAvoidance” and the US Postal Service’s (USPS) ads included references to “charge-allIntegralSuspiciousActivity” when they were observed being served to bots. Some of these advertisers may be specifically paying for bot avoidance pre-bid segments.

All major media agency holding companies - including IPG, Dentsu, Havas, Publicis Groupe, GroupM/WPP, and Omnicom, as well as MiQ and other agencies - were observed transacting bot traffic. The World Federation of Advertisers (WFA) recommends that “Contracts with agencies and vendor partners need to be revised to ensure that there are clear penalties for misallocating spend to ad fraud related inventory, where preventing it could be reasonably achieved.”

Many major media publishers who appear to have installed the IAS and/or DoubleVerify publisher optimization tools on their webpages were also seen serving ads to bots. For example, the Washington Post, Wall Street Journal, CNN, Weather.com, The Guardian, USA Today, Fandom, and Conde Nast were seen serving ads to bots whilst having IAS or DoubleVerify publisher optimization code configured on their pages.

Across the sample of URLScan.io crawls from 2019-2024, IAS’s Publisher Optimization tool labeled the URLScan.io bot as valid, human traffic 77% of the time.

In some cases, it appears that IAS’ publisher optimization tool labeled the same bot and page view session as both valid, human traffic (“fr=false”) and simultaneously as invalid, bot traffic (“fr=true”).

Furthermore, in some cases, even when IAS’ publisher optimization tool identified a given user as a bot, there were still ads served to the bot on behalf of advertisers who appeared to be using IAS’ advertiser-side tools.

On February 16th, 2023, URLScan.io’s bot visited quiltville.blogspot.com, and the IAS publisher optimization tool (via endpoint “pixel.adsafeprotected.com/services/pub”) was invoked. The IAS API endpoint returned the classification “fr=true”, indicating invalid, bot traffic. An ad for the Grand Ole Opry music stage in Nashville, Tennessee was served to the bot by Trade Desk and Magnite. The source code of the Opry ad that was served to a bot after IAS’s tool labeled the page view as “fr=true” included references to “charge-allIntegralSuspiciousActivity”. Source: https://urlscan.io/result/38691aec-a650-4ff6-a05c-9065fd72c938/#summary
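For analysts replicating this, the sketch below shows one hedged way to extract the “fr” flag from a captured response body. The exact format of the pixel.adsafeprotected.com/services/pub response is not publicly documented, so the regular expression is an assumption that tolerates both fr=true and JSON-style "fr":true forms.

```python
import re

def parse_fr_flag(response_body: str):
    """Return True for 'fr=true' (invalid/bot), False for 'fr=false', None if absent."""
    match = re.search(r'fr["\']?\s*[:=]\s*["\']?(true|false)', response_body)
    return None if match is None else match.group(1) == "true"

print(parse_fr_flag('{"fr":true,"slots":[]}'))  # True -> flagged as invalid traffic
print(parse_fr_flag("fr=false"))                # False -> labeled as valid traffic
```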

The source code of ads from brands such as Progressive Insurance and the Government of Ontario, Canada included references to “Scibids”, whilst those ads were served to bots with no pre-existing user IDs.

Whilst it is ostensibly possible that brands were not charged for some of the ads served to bots, analysis of data provided by advertisers in some cases suggests that the brands were in fact invoiced for ads served to bot traffic.

Some brands and agencies do not get access to high granularity impression level log file data with IP address, user agent, and user ID information. In those cases, it can be difficult if not outright impossible for the media buyers to assess how much they were invoiced for ads served to bots.

However, regardless of the financial impact, ads served to bots can increase advertisers’ Scope 3 carbon emissions and lead to increased electricity consumption. Furthermore, such ad delivery can result in skewed or erroneous web analytics data that biases media investment decisions.

The World Federation of Advertisers (WFA) recommends that “brands need to develop in-house expertise to support vendor selection, work with cyber security partners to help understand common threats and demand full transparency of investment”. The Association of National Advertisers (ANA) recommends that advertisers “refuse payment on non-human traffic in media contracts.”

Many advertisers - such as the New York City Police, Department of Homeland Security, the United States Navy, Army, Department of Veterans Affairs, HHS, healthcare.gov, CDC, USPS, the Federal Government of Germany, Australian Defence Force, Singaporean Police, Ontario Government, and governments of Utah, California, Virginia, Florida, New York, and Indiana - may benefit from undertaking a closer review of their digital advertising.
