Algorithmic curation

Algorithmic curation is the selection of online media by technologies such as recommender systems and personalized search. Curation entails the selective sharing of online content and recommendations based on inferred interests.^[1] Curation algorithms implement different filtering approaches, such as collaborative filtering and content-based filtering. Examples include search engine and social media products such as the Twitter feed, Facebook's News Feed, and Google Personalized Search.^[2]

History

Early algorithmic curation

Online platforms use newsfeed algorithms to determine what content to present to each user.^[3]^[4] The volume of content published on social media platforms created a need for automated filtering, as manual review of all available content by users is not feasible.^[3]^[4] These systems function as a form of gatekeeper, shaping which new material users are exposed to and influencing knowledge, attention, and political exposure.^[4]

Information overload

Early ranking algorithms addressed information overload by surfacing the most recent or most popular posts.^[3] Later systems shifted toward ranking content based on predicted engagement, aiming to increase the time users spend on a platform.^[3] Research has found that these engagement-oriented systems can increase the spread of misinformation and contribute to political polarization as a side effect of optimising for user interaction.^[4]
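The difference between these ranking strategies can be illustrated with a minimal sketch. The `Post` fields, the sample data, and the `predicted_engagement` value (standing in for the output of some engagement model) are hypothetical and are not taken from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    age_hours: float             # time since the post was published
    likes: int                   # simple popularity signal
    predicted_engagement: float  # hypothetical model output in [0, 1]


def rank_by_recency(posts):
    # Newest first: the smallest age comes first.
    return sorted(posts, key=lambda p: p.age_hours)


def rank_by_popularity(posts):
    # Most-liked first.
    return sorted(posts, key=lambda p: p.likes, reverse=True)


def rank_by_predicted_engagement(posts):
    # Highest predicted engagement first, regardless of age or likes.
    return sorted(posts, key=lambda p: p.predicted_engagement, reverse=True)


# Illustrative data only.
POSTS = [
    Post("a", age_hours=1.0, likes=3, predicted_engagement=0.20),
    Post("b", age_hours=5.0, likes=120, predicted_engagement=0.35),
    Post("c", age_hours=12.0, likes=40, predicted_engagement=0.90),
]
```

With this sample data, each strategy places a different post at the top of the feed, which shows how the choice of ranking objective alone changes what a user sees first.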

How algorithms change users' feeds over time

Algorithmic curation has been found to increase source diversity in some respects while simultaneously reducing the number of external links presented to users, which limits exposure to off-platform content.^[4] Research using agent-based modelling has examined how user behaviour, information quality, and algorithmic design interact with one another over time.^[3]^[4]

Emergence of AI

Platforms increasingly shifted from rule-based ranking systems toward machine-learning and AI-driven approaches, which allow feeds to be personalised at a larger scale and with greater responsiveness to user behaviour.^[3]^[4] For example, X (formerly Twitter) moved away from a chronological feed toward an AI-powered ranking system that personalises content for each user.^[4] These systems are capable of making ranking decisions across volumes of content and user interactions that would not be practical to handle manually.^[4]

Approach

Filter types

Collaborative filtering

Collaborative filtering (CF) methods create recommendations based on a person's usage patterns.^[5] CF predicts a person's preference for an item from the preferences of other users whose interests are similar.^[5] This process allows for the sharing of ratings between users with similar profiles.^[5] CF is based on patterns of human behaviour rather than machine analysis of the content itself.^[5] Users of CF systems rate items they have interacted with, and these ratings form a profile of interests.^[5] The CF system then matches that user with others who have similar profiles, and uses their ratings to generate recommendations.^[5] Collaborative filtering can be applied across various content types, including text, images, music, and financial products, and can account for complex attributes such as taste and quality that are difficult to represent explicitly.^[6]
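The matching step can be sketched as user-based collaborative filtering. The example below is one possible formulation, assuming cosine similarity between rating profiles and a similarity-weighted average for prediction; the user names, items, and ratings are hypothetical.

```python
import math

# Hypothetical rating profiles: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"film_a": 5, "film_b": 4, "film_c": 1},
    "bob": {"film_a": 5, "film_b": 5, "film_d": 4},
    "carol": {"film_a": 1, "film_c": 5, "film_d": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating profiles."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item.

    Returns None when no other user with a similar profile rated the item.
    """
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        similarity_sum += sim
    if similarity_sum == 0:
        return None
    return weighted_sum / similarity_sum
```

In this sample, `alice` has not rated `film_d`. Her profile is closer to that of `bob` than to that of `carol`, so the predicted rating is pulled toward the rating given by `bob`. No analysis of the films themselves is involved, only the pattern of ratings.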

Content-based filtering

Content-based filtering (CBF) builds a user profile to represent the types of items a user has engaged with, based on keywords and attributes used to describe those items.^[6]^[7] Recommendations are generated by presenting items similar to those the user has previously engaged with or is currently viewing.^[7] The CBF method creates a profile for each item based on discrete attributes and features, and then constructs a content-based user profile using a weighted vector of those features derived from items the user has rated, purchased, or interacted with.^[6]^[7] The weights represent the relative importance of each feature, and can be computed using techniques such as Bayesian classifiers, cluster analysis, decision trees, and artificial neural networks, with the goal of estimating the probability that a user will engage with a suggested item.^[6] One application of content-based filtering is Pandora Radio, where users provide an artist, genre, or composer to generate a station, and the system surfaces music with similar attributes.^[6]
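The profile-building step can be sketched as follows. This is a simplified illustration: feature weights are taken directly from the user's ratings rather than learned with a classifier, and candidate items are scored by cosine similarity to the profile. The track names and attributes are hypothetical.

```python
import math

# Hypothetical item profiles: item -> {attribute: strength}.
ITEM_FEATURES = {
    "track_1": {"jazz": 1.0, "piano": 1.0},
    "track_2": {"jazz": 1.0, "saxophone": 1.0},
    "track_3": {"metal": 1.0, "guitar": 1.0},
    "track_4": {"jazz": 1.0, "piano": 1.0, "vocal": 1.0},
}


def build_profile(rated, item_features):
    """Weighted feature vector: each rated item's features scaled by its rating."""
    profile = {}
    for item, rating in rated.items():
        for feature, value in item_features[item].items():
            profile[feature] = profile.get(feature, 0.0) + rating * value
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(value * b.get(feature, 0.0) for feature, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(rated, item_features, top_n=2):
    """Return the unrated items whose features best match the user profile."""
    profile = build_profile(rated, item_features)
    scored = [
        (cosine(profile, features), item)
        for item, features in item_features.items()
        if item not in rated
    ]
    scored.sort(reverse=True)
    return [item for _, item in scored[:top_n]]
```

A user who rated only `track_1` highly would be offered `track_4` first, because it shares both attributes with the rated track, followed by `track_2`, which shares one. Unlike collaborative filtering, no other users' data is needed.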

Technology

Recommender system

Recommender systems rank and suggest content to users based on a combination of implicit and explicit input.^[8] Implicit signals include time spent viewing or engaging with a specific item.^[8] Explicit signals include actions such as liking posts, saving store pages, reading news articles, or sharing content.^[8]
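One way to combine implicit and explicit signals is a weighted score per item. The sketch below is illustrative only: the `Interaction` fields, the weights, and the cap on viewing time are assumptions chosen for the example, not values used by any real system.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    item_id: str
    dwell_seconds: float = 0.0  # implicit signal: time spent viewing
    liked: bool = False         # explicit signal
    shared: bool = False        # explicit signal


# Hypothetical weights; a real system would tune or learn these.
WEIGHTS = {"dwell": 0.5, "liked": 1.0, "shared": 2.0}
DWELL_CAP_SECONDS = 60.0  # assumed cap so long views do not dominate the score


def interest_score(interaction):
    """Combine implicit and explicit signals into a single score."""
    dwell = min(interaction.dwell_seconds, DWELL_CAP_SECONDS) / DWELL_CAP_SECONDS
    return (
        WEIGHTS["dwell"] * dwell
        + WEIGHTS["liked"] * float(interaction.liked)
        + WEIGHTS["shared"] * float(interaction.shared)
    )


def rank_items(interactions):
    """Order item identifiers from highest to lowest inferred interest."""
    ordered = sorted(interactions, key=interest_score, reverse=True)
    return [interaction.item_id for interaction in ordered]
```

Under these assumed weights, an item that was shared outranks one that was only liked, which in turn outranks one that was only viewed.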

Personalized search

Personalized search aims to retrieve results most relevant to the user by incorporating contextual factors beyond the explicit query, such as past queries, browsing history, and inferred interests.^[9] Social media platforms such as X (formerly Twitter) and Bluesky generate recommendations based on similar users and the content those users interact with.^[10] Personalized search may also allow users to explicitly filter results by blocking content containing certain phrases or hashtags.^[11] For first-time users without prior history, personalized search may draw on content-based filtering to establish an initial context.^[6] Similar processes are used by search engines and retail platforms to tailor results and product recommendations to individual users.
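A simplified sketch of this behaviour is shown below. It assumes plain keyword matching, a fixed boost for terms that appear in the user's past queries, and an explicit list of blocked phrases; the function name, the `history_boost` parameter, and its default value are hypothetical.

```python
def personalized_search(query, documents, past_queries, blocked_phrases,
                        history_boost=0.5):
    """Rank documents that match the query, boosted by the user's query history.

    Documents containing any blocked phrase are filtered out entirely.
    """
    query_terms = set(query.lower().split())
    history_terms = set()
    for past in past_queries:
        history_terms.update(past.lower().split())

    results = []
    for doc in documents:
        text = doc.lower()
        # Explicit user filter: skip documents with blocked phrases or hashtags.
        if any(phrase.lower() in text for phrase in blocked_phrases):
            continue
        words = set(text.split())
        query_match = len(query_terms & words)
        if query_match == 0:
            continue
        # Context beyond the explicit query: overlap with past queries.
        history_match = len(history_terms & words)
        results.append((query_match + history_boost * history_match, doc))

    results.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in results]
```

For a user whose past queries concern programming, the ambiguous query "python" would rank a programming tutorial above a page about snakes, while a document containing a blocked hashtag would not appear at all.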

AI contribution

Artificial intelligence contributes to algorithmic curation through machine-learning models capable of processing large volumes of data.^[12] Techniques such as deep learning and reinforcement learning allow curation algorithms to model user preferences with greater granularity alongside established filtering approaches.^[12] This enables platforms to adjust content rankings rapidly in response to user behaviour.^[12] In social media and streaming contexts, AI-driven systems arrange feeds according to predicted relevance, with the outputs shaped by patterns present in the training data.^[13]

Social media and potential impact

Echo chambers

Social media algorithms, such as those used by X (formerly Twitter), recommend content that the system predicts a user will engage with positively. Content from accounts with differing perspectives is less likely to be surfaced, which may reduce source and topic diversity and contribute to the formation of echo chambers.^[4] For example, Facebook's News Feed is designed to surface content aligned with users' prior engagement, which may reinforce existing views.^[14] This dynamic may contribute to filter bubbles, in which users are seldom exposed to content outside their existing interests. Users may further narrow their feeds by actively blocking certain content or accounts.^[4]

Over-representation

A pattern observed across social media platforms is the concentration of algorithmic visibility among a small subset of users. Content from the most active users, those with the largest followings, or those generating the most engagement tends to be surfaced more frequently, so a small number of accounts can make up a disproportionate share of what appears in other users' feeds.^[4]
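This kind of concentration can be quantified as the share of feed items produced by the most visible accounts. The sketch below is one simple way to compute it; the function name, the `top_fraction` parameter, and the sample feed are hypothetical and do not reproduce the method of any cited study.

```python
from collections import Counter


def top_share(feed_authors, top_fraction=0.1):
    """Fraction of feed items authored by the most frequent accounts.

    feed_authors: one author identifier per item shown in a feed.
    top_fraction: share of distinct accounts treated as the "top" group
                  (at least one account is always included).
    """
    if not feed_authors:
        return 0.0
    counts = Counter(feed_authors)
    n_top = max(1, round(len(counts) * top_fraction))
    top_items = sum(count for _, count in counts.most_common(n_top))
    return top_items / len(feed_authors)


# Illustrative feed: ten items from four accounts.
SAMPLE_FEED = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
```

In the sample feed, one account out of four is responsible for six of the ten items, so `top_share(SAMPLE_FEED, top_fraction=0.25)` is 0.6.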