algorithm Archives · TechNode

ByteDance’s Douyin reveals algorithm amid government push to tackle online platform issues

Shuang Jing — Tue, 01 Apr 2025 11:14:26 +0000

On March 30, Han Shangyou, president of Douyin, the Chinese sister app of TikTok, announced the launch of the “Douyin Safety and Trust Center” website. The site publicly disclosed the principles behind Douyin’s recommendation algorithm and explained how it predicts user behavior.

Why it matters: Since gaining attention in 2016, Douyin’s unique content recommendation algorithm has been seen as a key factor in its success. However, it has also raised concerns about creating filter bubbles and biases. Over the past nine years, China’s online environment has undergone significant changes, and as one of the country’s most popular social media platforms, Douyin has played a central role in shaping this digital landscape. With the public disclosure of its algorithm, Douyin aims to address the issues it has helped create.

Details: Douyin introduced two models: the Wide&Deep model and the Two-Tower Retrieval Model.

The Wide&Deep model, as its name suggests, combines a single-layer Wide component with a multi-layer Deep component. The Wide component focuses on strengthening the model’s “memorization” ability, which refers to its capacity to learn directly from the co-occurrence frequencies of items or features in historical data. In contrast, the Deep component enhances the model’s “generalization” ability, allowing it to identify correlations between features and uncover links between rare or even unseen features and the final label.
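
As a rough illustration of the Wide&Deep idea described above, the sketch below adds a linear “wide” score over memorized feature pairs to a small “deep” score computed from embeddings. All feature names, weights and layer sizes are hand-picked assumptions for the example; none of them come from Douyin’s disclosure.

```python
import math

# Hypothetical, hand-picked parameters for illustration only (not Douyin's).
# Wide part: one weight per memorized (user feature, item feature) pair.
WIDE_WEIGHTS = {("user_likes_pets", "video_cat"): 2.0}

# Deep part: a small embedding table, one hidden layer and an output layer.
EMBEDDINGS = {
    "user_likes_pets": [0.9, 0.1],
    "video_cat": [0.8, 0.2],
    "video_cooking": [0.1, 0.9],
}
HIDDEN_W = [[1.0, -1.0], [-1.0, 1.0]]  # two hidden units
OUT_W = [1.5, -0.5]


def wide_logit(user_feats, item_feats):
    """Memorization: sum the weights of feature pairs seen together before."""
    return sum(WIDE_WEIGHTS.get((u, i), 0.0) for u in user_feats for i in item_feats)


def deep_logit(feats):
    """Generalization: average the embeddings, then apply one ReLU layer."""
    dim = len(HIDDEN_W[0])
    vecs = [EMBEDDINGS[f] for f in feats]
    x = [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in HIDDEN_W]
    return sum(w * h for w, h in zip(OUT_W, hidden))


def predict(user_feats, item_feats):
    """Combine both parts into one logit and squash it into a probability."""
    logit = wide_logit(user_feats, item_feats) + deep_logit(user_feats + item_feats)
    return 1.0 / (1.0 + math.exp(-logit))


p_cat = predict(["user_likes_pets"], ["video_cat"])
p_cooking = predict(["user_likes_pets"], ["video_cooking"])
print(f"cat video: {p_cat:.3f}, cooking video: {p_cooking:.3f}")
```

In this toy setup the cat video scores higher because both parts push in the same direction: the wide part has memorized the pair, and the deep part finds the two embeddings close together. A production model would learn all of these parameters jointly from data.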

The Two-Tower Retrieval Model recommends content by converting both user and content features into numbers, such as “0” for a cat video and “1” for a dog video. These numbers are fed into two separate deep learning models – one for users and one for content – creating unique “digital fingerprints” for both. The model then compares the user’s fingerprint with those of all available videos, measuring how close they are. The closer the fingerprints, the more likely the video is to be recommended. This approach helps the system match users with videos they might like without needing to understand the actual content, just the numbers.
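
The two-tower matching described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: each “tower” is reduced to an embedding lookup plus averaging (a stand-in for a real deep network), and the category IDs, tables and function names are invented for the example rather than taken from Douyin.

```python
import math

# Hypothetical numeric IDs for content categories, as in the article's example.
VIDEO_CATEGORY_IDS = {"cat": 0, "dog": 1, "cooking": 2}

# Stand-ins for the two learned towers: one table per side (assumed values).
USER_TABLE = {0: [1.0, 0.0], 1: [0.9, 0.2], 2: [0.1, 0.9]}
ITEM_TABLE = {0: [0.9, 0.1], 1: [0.8, 0.3], 2: [0.0, 1.0]}


def tower(feature_ids, table):
    """Map a list of feature IDs to one unit-length 'fingerprint' vector."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for fid in feature_ids:
        for k, value in enumerate(table[fid]):
            total[k] += value
    norm = math.sqrt(sum(v * v for v in total)) or 1.0
    return [v / norm for v in total]


def similarity(a, b):
    """Dot product of unit vectors, i.e. cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def recommend(user_history, candidates, top_k=2):
    """Rank candidate videos by closeness to the user's fingerprint."""
    user_vec = tower([VIDEO_CATEGORY_IDS[c] for c in user_history], USER_TABLE)
    scored = []
    for name, category in candidates.items():
        item_vec = tower([VIDEO_CATEGORY_IDS[category]], ITEM_TABLE)
        scored.append((similarity(user_vec, item_vec), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


top_videos = recommend(
    ["cat", "cat", "dog"],
    {"v1": "cat", "v2": "cooking", "v3": "dog"},
)
print(top_videos)
```

Because the item fingerprints do not depend on the user, a real system can compute them ahead of time and search them with a nearest-neighbor index, which is a large part of why this design suits retrieval over very large catalogs.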

The website also outlined the review process for governance on the Douyin platform, detailing how it handles challenges related to rumors, online harassment, and other violations.

Context: The actions follow a three-month government campaign, launched in late November, aimed at addressing common algorithmic issues on online platforms, such as filter bubbles, where users are exposed only to content that aligns with their views, and discriminatory pricing practices targeting different demographics. The campaign was led by the Communist Party’s cyberspace affairs commission, the Ministry of Industry and Information Technology, and other relevant government agencies.

In January, Douyin Group announced plans to introduce ten measures aimed at increasing transparency on the platform, including efforts to make its algorithms and governance processes more open. On Jan. 8, Li Liang, vice president of Douyin, rejected the idea that the company’s content recommendation algorithms contribute to the creation of filter bubbles. He noted that ByteDance’s algorithms have come under global scrutiny, with specific concerns about filter bubbles and restricted content exposure.

Digital policy experts weigh in on China’s new algorithm regulation

Claudia Vernotti — Tue, 05 Apr 2022 00:02:00 +0000

In mid-March, two weeks after China’s new algorithm regulation came into effect, many of the most popular Chinese apps, including WeChat, Douyin, Weibo, and Taobao, changed their app settings to allow users to turn off algorithm-based recommendation services, complying with the new rule.

Announced last November and brought into effect on March 1, the new regulation, called Provisions on the Administration of Algorithm-Generated Recommendations for Internet Information Services, was jointly issued by four Chinese government agencies: the Cyberspace Administration of China (CAC), the Ministry of Industry and Information Technology (MIIT), the Ministry of Public Security, and the State Administration for Market Regulation (SAMR).

The regulation deals with the very mechanisms governing the provision of online content – algorithm-powered recommendations that lie at the core of the business model of many of the country’s popular internet services, from social media applications to e-commerce sites, delivery apps, and video platforms.

The regulation is also the world’s first attempt by a national regulator to control the possible abuse of algorithmic decisions. It asks companies to notify consumers about the usage of algorithms, provide an opt-out choice, and protect vulnerable groups, such as minors and seniors.

To better assess the relevance of the new rules for the future of the country’s digital economy, its relation to other regulations in and outside of China, and its possible impact on China’s tech industry, I talked to a number of specialists in digital policy.

More regulations on digital economy worldwide

“It is really impressive to see how fast China adopted or revised key legislation to address certain challenges bound to the digital economy,” said Elena Scaramuzzi, Head of Global Research at Brussels-based independent regulatory research company Cullen International.

Scaramuzzi added that China isn’t alone in regulating digital enterprises and behaviors; other countries and regions are looking to do the same, including the EU, Australia, South Korea, Japan, Singapore, and the UK.

There have been considerable efforts to regulate the data economy across the world. A noteworthy example from the EU, said Scaramuzzi, is the European Commission’s proposal to regulate the use of AI, presented in April 2021. “If approved, the new EU rules would introduce bans on AI practices which pose certain risks, such as AI systems that use subliminal techniques to distort a person’s behavior. According to our recently published research on the global policy trends in AI, half of the surveyed economies (Brazil, China, the EU, Germany, the UK, and some US states) have rules today requiring transparency, explainability, and contestability of certain AI-based decisions.”

Scaramuzzi also pointed out that China’s algorithm regulation does not come in a silo. “The regulation refers to several laws of China, including among others the Personal Information Protection Law (PIPL),” she said.

The new algorithm rules borrow the principles of transparency from PIPL as well as user consent by establishing that algorithm-enabled service providers should disclose to the public the algorithm’s basic logic, purpose, and mechanisms, and allow users to switch off such services, introducing a de-facto opt-out system. They also deal with competition issues, with clauses banning, for instance, the use of algorithms to restrict other service providers or to conduct price discrimination.

The broad scope of the new rules – encompassing different angles – was also remarked upon by an industry practitioner I consulted, who represents a large Chinese tech group. The person asked not to be named because they do not have their company’s approval to speak to the media, and further highlighted the moralistic angle present in the rules. “One can best understand the algorithm rules and China’s latest interventions in the field of privacy, data security, online gaming, and ed-tech by looking at all this under the prism of the common prosperity drive.” Common prosperity is a political-economic campaign launched by the Chinese government in the middle of last year to reduce widening social inequality.

This social values-focused approach, unique to the Chinese way of regulating the digital economy, can be found in the wording of the algorithm regulation, which asks all algorithm-recommended service providers to “adhere to mainstream values, actively spread positive energy, and promote the positive and good application of algorithms.”

Chinese regulators have indeed appeared to be increasingly sensitive and responsive to public controversies and complaints by consumers in this field. The mistreatment of gig workers by delivery platforms – which has previously triggered widespread criticism – is for instance explicitly addressed in the new algorithm rules. So too is the harm caused to minors by algorithms designed to stimulate addictions or overconsumption.

Display transparency in algorithms

The noteworthiness of the rules goes beyond their broad scope and moralistic intent. In a comment to the Stanford University-funded DigiChina project, Rogier Creemers, University Lecturer of Modern China Studies at the University of Leiden, noted how “the new regulations attempt to impose a regulatory system classified by type of application and level of impact for algorithms.” In essence, the law introduces a sort of “transparency index” for algorithms; service providers capable of influencing public opinion or social mobilization will need to disclose information such as the types and the scope of applications of their algorithms, as well as their self-assessment reports.

To achieve that level of transparency, Chinese regulators asked companies to register their algorithms with a state system, launching a new online portal on the same day the regulations came into effect.

When it comes to regulating the digital economy, it is interesting to note the similarities in approaches across different jurisdictions.

“The new Chinese regulation indicates a point of contact with the European approach to data regulation, one that is based on the principle of transparency and disclosure to the interested parties,” said Francesco Pizzetti, Professor of Constitutional Law at Turin University and Former President of the Italian Data Protection Authority.

“Chinese regulators seem to have recognized the need for users to be informed about the operating rules of the digital services they use on a daily basis. This is particularly important given the historical change we are witnessing towards an overarching digital society, in which automated programs channel the information we receive.”

Algorithm rules’ impact on Chinese tech industry

It may be too soon to forecast the regulation’s impact on China’s tech companies and more generally on the growth of China’s internet sector, which has already been under heavy regulatory pressure in the past year.

Chinese tech companies are facing headwinds on multiple fronts: tightened domestic regulatory scrutiny, a slowing economy, concerns over the possible delisting of US-listed Chinese stocks from foreign markets, and the delicate position Chinese tech groups find themselves in amid Russia’s war in Ukraine. This regulation is likely to add more concerns to their plate.

Angela Huyue Zhang, director of the Center for Chinese Law at the University of Hong Kong, said in an op-ed for Nikkei that she fears China’s attempt to regulate algorithms might hinder the growth of its most creative internet companies, such as ByteDance, the parent of Douyin and its international version TikTok, which both rely on sophisticated recommendation engines.

One can get a flavor of what an opt-out option for recommender systems might bring by looking at the case of Apple in Western markets. When Apple offered iPhone users the option to switch off tracking, 84% of users took it, putting a dent in the ad revenue of apps like Facebook and Instagram.

“If a similarly large share of Chinese consumers opted out of personalization, collecting and using personal data would become much more costly for both platforms and merchants,” Zhang told Project Syndicate in a February interview, adding that the overall trend of tightened data regulation might not only affect China’s consumer internet business but also derail China’s ambition to become an AI superpower.

However, not all share the same view. The unnamed tech industry practitioner I quoted earlier thinks there is room for optimism.

“The new regulatory move will oblige tech companies to engineer a cultural change inside the organization so as to make sure that privacy is embedded, consumers are protected, trust in digital services, and sustainability of business models are ensured. While there might be disruption in the short-term, with platforms having to adjust their internal processes, in the long-term it will be beneficial for the healthy growth of China’s internet ecosystem,” the person said.

While it might be too early to assess the actual effects of the new regulation, it will certainly be closely watched by both the tech industry community and the international legislators’ circle. As hinted at by Professor Pizzetti, the opportunity in front of us lies in the future establishment of shared regulations that can enable the free circulation of data in the digital world.

Toutiao reveals logic behind its algorithms, shows they are serious about filtering “unsavory” content

Timmy Shen — Fri, 12 Jan 2018 08:22:52 +0000

News aggregation platform Jinri Toutiao held a meeting with engineers from other internet companies yesterday to disclose how it derives its algorithms (in Chinese), and stressed that machines do not make all the decisions when recommending news to users.

“We constantly redirect, design, monitor, and manage algorithmic models instead of allowing the machine to make all the decisions,” said Toutiao’s senior engineer Cao Huanhuan at the event. Cao also introduced how the team trained the large-scale online recommendation model and highlighted the fact that Toutiao filters out low-quality and explicit content with its AI recognition technique.

Toutiao’s AI news recommendation system has long been the firm’s core asset for growing its user base and pocketing advertiser cash. The system is known for recommending and pushing relevant content to its users based on their behaviors and preferences. The firm even creates fake news to train its anti-fake news AI.

However, just two weeks ago, Toutiao and Phoenix News received punishments from the internet watchdog for spreading pornographic materials and publishing news without a proper license. Toutiao, as a consequence, had to suspend updating several news verticals for 24 hours.

In response to the tightened enforcement of the rules, Toutiao is hiring 2,000 content review editors, preferably Communist Party members, to comb its app for unsavory content. Toutiao’s decision to disclose the logic and outline of its algorithms may therefore not seem so absurd, given that it is trying harder to avoid recommending “inappropriate” content to users.

Toutiao is making fake news to train its anti-fake news AI

Frank Hersey — Thu, 07 Dec 2017 02:45:29 +0000

Toutiao’s AI software did not generate this headline, but for the 20 million pieces of content that flow through the platform each day, headline generation and A/B testing are just two of the AI services Toutiao uses to get more people tapping.

Speaking to foreign journalists for the first time as head of the Jinri Toutiao AI Lab and vice president of the app’s owner Bytedance, Dr. Ma Wei-Ying talked about the tech that his lab is working on, why it has a bot that generates fake news and what it knows about its users.

Jinri Toutiao is a news recommendation app that is trained and updated in real time on a user’s behavior. Unlike search engines, Ma pointed out, its search function produces results for each individual rather than one ranking for everyone.

“This is the democratization of content creation,” said Ma, putting Bytedance in line with other Chinese tech companies that have recently declared themselves as content companies. “Toutiao is becoming a new information platform for people to find information and connect with information. People are using their smartphones not just to access information, but to create information. They don’t need their own website–they can use Toutiao to directly upload and publish the information and content they create.”

The tremendous amount of data generated by users and creators allows the training of neural network models. Applying AI to the data gathered is generating a better understanding of the world these users are in. “We are moving from a digital representation of the world to a semantic representation of the world,” said Ma.

Ma believes the system is going to improve across the board. “Content creation will be fundamentally revolutionized in next few years” as AI allows the “mining of human intelligence to close the feedback loop” of each stage of the lifecycle of content creation, moderation, dissemination, and consumption. Here’s how.

Make fake news to beat fake news

Bytedance has a different approach to tackling fake news: writing it. The AI lab that Ma heads has developed a bot that uses the company’s growing database of real fake news stories to generate its own fake fake news. It then has another bot for detecting fake news which is trained by analyzing its counterpart’s fake feed, and by drawing on a matching database of real news. “One is good at writing, which means this also helps us to advance machine writing, and the other is machine reading. These two can push each other to improve by using the label data and assimilated data through our algorithms,” said Ma.

Ma believes that having two competing algorithms allows them each to improve. Toutiao lets users report what they believe to be fake news and analyzes comments to detect whether they suggest the content might be fake. When the system identifies a piece of fake news that has got through, it will notify all who have read it that they had read something fake.

Bytedance is using this “dual-learning” technique in other ways. It machine translates news from Chinese into English, then has another program to translate that article from English into Chinese to improve both processes. Fake news can also be translated to allow the algorithms to train for Toutiao’s global expansion. Other aspects of global expansion are language-independent, such as video, meaning those algorithms have already been trained on large numbers of Chinese users.

In the future, the culmination of analyzing successful pieces, building a database of popular topics, and developing machine writing will mean Toutiao will be able to automatically generate articles for its readers on their favorite subjects.

Better algorithms, better articles

“We adjust our strategy every week. It’s a constant experiment,” said Ma. The system monitors content in real time and also works to predict whether a piece of content will be a success. Algorithms offer four headlines to article writers, then conduct A/B testing to determine which has the most impact. But not all articles are subject to algorithms, because of the computing power involved. Only when a piece starts to gain traction will it get extra help.
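
The headline test described above can be sketched as follows. This is a minimal illustration that assumes the winner is simply the variant with the highest click-through rate once it has enough impressions; the `Variant` class, the `min_impressions` threshold and the sample numbers are invented for the example and are not Toutiao’s method.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One candidate headline and the traffic it has received so far."""
    headline: str
    impressions: int = 0
    clicks: int = 0

    @property
    def ctr(self) -> float:
        """Click-through rate; zero when the variant has not been shown."""
        return self.clicks / self.impressions if self.impressions else 0.0


def pick_winner(variants, min_impressions=1000):
    """Return the eligible variant with the highest CTR, or None if too early."""
    eligible = [v for v in variants if v.impressions >= min_impressions]
    if not eligible:
        return None
    return max(eligible, key=lambda v: v.ctr)


# Four hypothetical headlines for the same article, with made-up traffic.
variants = [
    Variant("Headline A", impressions=5000, clicks=150),
    Variant("Headline B", impressions=5000, clicks=240),
    Variant("Headline C", impressions=5000, clicks=190),
    Variant("Headline D", impressions=300, clicks=40),  # too little data so far
]
winner = pick_winner(variants)
print(winner.headline if winner else "keep testing")
```

The impression threshold keeps a lightly shown variant from winning on a lucky streak. A more careful test would also check that the gap between variants is statistically significant before declaring a winner.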

Machine learning is used for viral prediction. The system compares incoming articles with previous content that has taken off, and its accuracy increases with constant feedback as its predictions prove successful. Ma acknowledged that care has to be taken to prevent the algorithms from distorting the popularity of particular elements of content, or from blocking content by new users who have yet to establish a positive profile in the system.

Automated sports commentary

Object recognition in video is also well developed and is being used to drive further personalization. Bytedance is working on smarter, personalized sports coverage, explained Ma. The current one-feed-fits-all approach will be replaced with a tailored viewing experience when fan data indicates an interest in, for example, a particular player. Coverage will focus more on that player, with the end goal being a personalized, automated commentary and onscreen captions.

Location, location, location. And time.

Toutiao builds up an idea of users’ lives including their whereabouts and habits. As well as understanding what content the user is interested in, the AI adjusts recommendations based on current and historic location. Ma gave an example of this which shows the sophistication of the tool. Chinese people living in the US, using Toutiao as part of their everyday lives there, are generating a footprint. Then suddenly Chinese New Year comes around and the location changes from the US to somewhere in China. The news may change accordingly there and then, but once the user heads back to the States, the software assumes that the user’s location at Chinese New Year was significant to them, and probably their hometown. Once back in the US, if any news stories crop up in their supposed hometowns, they will show up in the users’ feeds.
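
The hometown inference Ma describes can be approximated with a simple heuristic, sketched below: treat the most frequent city as the user’s usual location, then take the city visited most often on holiday dates that differs from it. The function name, data shapes, dates and cities are assumptions made for the example; the article does not say how Toutiao actually implements this.

```python
from collections import Counter


def infer_hometown(checkins, holiday_dates):
    """Guess a hometown from (date, city) records.

    checkins: list of (date string, city) pairs, one per observed day.
    holiday_dates: set of date strings that fall within a major holiday.
    Returns the non-usual city seen most often on holiday dates, or None.
    """
    if not checkins:
        return None
    # The city seen most often overall is assumed to be where the user lives.
    usual_city = Counter(city for _, city in checkins).most_common(1)[0][0]
    # Cities visited during the holiday that differ from the usual one.
    holiday_cities = Counter(
        city
        for date, city in checkins
        if date in holiday_dates and city != usual_city
    )
    if not holiday_cities:
        return None
    return holiday_cities.most_common(1)[0][0]


# Hypothetical location history: mostly in the US, briefly in China for the holiday.
checkins = [
    ("2017-01-20", "San Francisco"),
    ("2017-01-21", "San Francisco"),
    ("2017-01-27", "Chengdu"),
    ("2017-01-28", "Chengdu"),
    ("2017-02-05", "San Francisco"),
    ("2017-02-06", "San Francisco"),
]
holiday_dates = {"2017-01-27", "2017-01-28"}
hometown = infer_hometown(checkins, holiday_dates)
print(hometown)
```

Once a likely hometown is stored on the profile, news tagged with that city can be mixed into the feed even after the user’s current location returns to the usual one.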

Time is used as a gauge for what is appropriate to send. Algorithms work out when a person is busy, so the app will not bombard them with too much content and will save it until they are free. On a larger scale, the data provides profiles of cities and areas of cities in terms of people’s working habits. On an individual scale, these patterns can suggest what a person’s occupation is, but the data is anonymized. The system generates one user ID per smartphone, made up of a billion factors, which only an algorithm can identify.

Moderation and government relations

In a separate briefing, Bytedance senior vice-president for corporate development Liu Zhen revealed that of the 20 million pieces of content uploaded to Toutiao each day, 90% are machine moderated, meaning the other 2 million pieces are human-reviewed. Although Toutiao has been working on its moderation for five years, humans are and always will be needed, according to Ma.

“We have a very good communication channel between the company and the government. So far we’ve been working very hard because we are a new platform, a new kind of application exploring a new frontier. Things have been going quite smoothly because the communication channel is very open and very healthy,” said Ma.