Large Language Models for Rating the Language of Children's Videos on YouTube
By Sumeet Kumar, Ashiqur Khudabukhsh, Mallikarjuna T.
Citation
Kumar, Sumeet; Khudabukhsh, Ashiqur; T., Mallikarjuna. Large Language Models for Rating the Language of Children's Videos on YouTube.
Abstract
With the rise of YouTube as the primary source of children's entertainment, concerns have been raised about the lack of quality content. The absence of any video quality indicator or certification process, even for videos with billions of views, aggravates parental worries. To address these concerns, we propose a machine-learning-based approach to assess the language quality of children's videos. We use labeled data from a movie rating website (meant for parents to decide on a movie's appropriateness) to train a deep-learning model for rating the language used in YouTube Kids videos. We further augment the deep-learning model with a Large Language Model (LLM) that generates a text summary stating the reason for the rating and highlighting keywords and phrases inappropriate for children. Using the proposed system, we analyze over 85,000 videos from the top 100 YouTube Kids channels and compare them against Disney/Pixar movies that are certified for children's viewing. Our analysis reveals that certified movies generally have a lower language rating than YouTube Kids channels (lower is better), and that animations on YouTube usually have lower language ratings than non-animations on YouTube. Our analysis highlights a need for more stringent guidelines for creators of children's content.

Sumeet Kumar is an Assistant Professor of Information Systems at the Indian School of Business (ISB). He studies problems at the intersection of technology and society. He is interested in analysing user behaviour, quantifying polarisation on online forums, and finding advertisements disguised as regular content on online platforms. His current focus is on identifying implicit or hidden advertisements in videos posted on children’s platforms such as YouTube Kids.

Additionally, Professor Kumar has conducted research in software design and development, with particular emphasis on user experience. He has investigated the use of mobile phone sensors during emergencies to improve situational awareness. His study on the Wireless Emergency Alerts (WEA) service in the United States addressed several issues of critical importance to the effectiveness and adoption of emergency alerts. Notably, some of his research recommendations were included in the US Federal Communications Commission's (FCC) proposed changes to WEA.

He completed his undergraduate education at the Indian Institute of Technology (IIT) Kanpur. He holds two Master’s degrees, in Software Engineering and in Machine Learning, both from Carnegie Mellon University, where he also earned his doctorate.

Sumeet Kumar