Natural Language Processing for Social Media. Diana Inkpen

the structure of the social networks. We actually use the term in a more general sense to refer to applications that do intelligent processing of social media texts and meta-data. Some applications could access very large amounts of data; therefore, the algorithms need to be adapted to be able to process data (big data) in an online fashion and without necessarily storing all the data.

      This motivated us to give two tutorials: Applications of Social Media Text Analysis at EMNLP 2015¹ and Natural Language Processing for Social Media at the 29th Canadian Conference on Artificial Intelligence (AI 2016).² We also organized several workshops on this topic, Semantic Analysis in Social Networks (SASM 2012)³ and Language Analysis in Social Media (LASM 2013⁴ and LASM 2014⁵), in conjunction with conferences organized by the Association for Computational Linguistics⁶ (ACL, EACL, and NAACL-HLT).

      Our goal was to reflect a wide range of research and results in the analysis of language with implications for fields such as NLP, computational linguistics, sociolinguistics, and psycholinguistics. Our workshops invited original research on all topics related to the analysis of language in social media, including the following:

      • What do people talk about on social media?

      • How do they express themselves?

      • Why do they post on social media?

      • How do language and social network properties interact?

      • Natural language processing techniques for social media analysis.

      • Semantic Web/ontologies/domain models to aid in understanding social data.

      • Characterizing participants via linguistic analysis.

      • Language, social media, and human behavior.

      There were several other workshops on similar topics, for example, the Making Sense of Microposts (#Microposts)⁷ workshop series in conjunction with the World Wide Web Conference 2012–2016. These workshops focused in particular on short informal texts that are published without much effort (such as tweets, Facebook shares, Instagram-like shares, and Google+ messages). There has been another series of workshops on Natural Language Processing for Social Media (SocialNLP) since 2013, with SocialNLP 2017 offered in conjunction with EACL 2017⁸ and IEEE BigData 2017.⁹

      The intended audience of this book is researchers who are interested in developing tools and applications for automatic analysis of social media texts. We assume that the readers have basic knowledge in the area of natural language processing and machine learning. We hope that this book will help the readers better understand computational linguistics and social media analysis, in particular text mining techniques and NLP applications (such as summarization, localization detection, sentiment and emotion analysis, topic detection, and machine translation) designed specifically for social media texts.

      Atefeh Farzindar and Diana Inkpen

      December 2017

       1 http://www.emnlp2015.org/tutorials/3/3_OptionalAttachment.pdf https://www.cs.cmu.edu/~ark/EMNLP-2015/proceedings/EMNLP-Tutorials/pdf/EMNLP-Tutorials06.pdf

       2 http://aigicrv.org/2016/

       3 https://aclweb.org/anthology/W/W12/#2100

       4 https://aclweb.org/anthology/W/W13/#1100

       5 https://aclweb.org/anthology/W/W14/#1300

       6 http://www.aclweb.org/

       7 http://microposts2016.seas.upenn.edu/

       8 http://eacl2017.org/

       9 http://cci.drexel.edu/bigdata/bigdata2017/

       Acknowledgments

      This book would not have been possible without the hard work of many people. We would like to thank our colleagues at NLP Technologies Inc., the NLP research group at the University of Ottawa, and our students James Webb and Ruining Liu from the University of Southern California. We would like to thank in particular Prof. Stan Szpakowicz from the University of Ottawa for his comments on the draft of the book, and two anonymous reviewers for their useful suggestions for revisions and additions. We thank Prof. Graeme Hirst at the University of Toronto and Michael Morgan from Morgan & Claypool Publishers for their continuous encouragement.

      Atefeh Farzindar and Diana Inkpen

      December 2017

      CHAPTER 1

       Introduction to Social Media Analysis

      Social media is a phenomenon that has recently expanded throughout the world and quickly attracted billions of users. This form of electronic communication through social networking platforms allows users to generate their own content and share it in various forms: information, personal words, pictures, audio, and videos. As a result, social computing has emerged as an area of research and development that includes a wide range of topics such as Web semantics, artificial intelligence, natural language processing, network analysis, and Big Data analytics.

      Over the past few years, online social networking sites (Facebook, Twitter, YouTube, Flickr, MySpace, LinkedIn, Metacafe, Vimeo, etc.) have revolutionized the way we communicate with individuals, groups, and communities, and have altered everyday practices [Boyd and Ellison, 2007].

      The broad categories of social media platforms are: content-sharing sites, forums, blogs, and microblogs. On content-sharing sites (such as Facebook, Instagram, Foursquare, Flickr, YouTube) people exchange information, messages, photos, videos, or other types of content. On Web user forums (such as StackOverflow, CNET forums, Apple Support) people post specialized information, questions, or answers. Blogs (such as Gizmodo, Mashable, Boing Boing, and many more) allow people to post messages and other content and to share information and opinions. Microblogs (such as Twitter, Sina Weibo, Tumblr) are limited to short texts for sharing information and opinions. The modalities of sharing content are, in order: posts; comments to posts; explicit or implicit connections to build social networks (friend connections, followers, etc.); cross-posts and user linking; social tagging; likes/favorites/starring/voting/rating/etc.; author information; and linking to user profile features.¹ In Table 1.1, we list more details about social media platforms and their characteristics and types of content shared [Barbier et al., 2013].

      Social media statistics for January 2014 showed that Facebook had grown to more than 1 billion active users, adding more than 200 million users in a single year. Statista,² the world’s largest statistics portal, announced the ranking of social networks based on the number of active users. As presented in Figure 1.1, the ranking shows that Qzone took second place with more than 600 million users. Google+, LinkedIn, and Twitter completed the top 5 with 300 million, 259 million, and 232 million active users, respectively.


      Statista also provided the growth trends for Facebook and LinkedIn, illustrated in Figure 1.2 and Figure 1.3, respectively. Figure 1.2 shows that Facebook grew from 845 million users at the end of 2011 to 1,228 million users by the end of 2013. As depicted in Figure 1.3, LinkedIn reached 277 million users by the end of 2013, up from only 145 million users at the end of 2011. Statista also calculated the annual income of both Facebook and LinkedIn, which in 2013 totaled US$7,872 million and US$1,528 million, respectively.
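
      The user counts quoted above can be turned into growth rates with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the year-end figures given in the text; the function names `total_growth` and `annualized_growth` are illustrative (they do not come from Statista or any library), and the annualized rate assumes constant compound growth over the two years, which the source does not claim.

```python
# Year-end user counts in millions, as quoted in the text (Statista figures).
USERS_MILLIONS = {
    "Facebook": {2011: 845, 2013: 1228},
    "LinkedIn": {2011: 145, 2013: 277},
}


def total_growth(start: float, end: float) -> float:
    """Relative growth over the whole period, e.g. 0.25 means +25%."""
    return (end - start) / start


def annualized_growth(start: float, end: float, years: int) -> float:
    """Constant yearly rate that would turn `start` into `end` after `years`.

    Assumption: growth compounds at the same rate every year.
    """
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    for platform, counts in USERS_MILLIONS.items():
        start, end = counts[2011], counts[2013]
        print(
            f"{platform}: {total_growth(start, end):.1%} total, "
            f"{annualized_growth(start, end, 2):.1%} per year"
        )
```

      On these figures, Facebook's user base grew by roughly 45% over the two years, while LinkedIn's grew by roughly 91%, nearly doubling from a much smaller base.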
