Social Media Data Mining and Analytics

  • E-book (PDF)
  • 352 pages
CHF 35.00
Available for immediate download

Description

Harness the power of social media to predict customer behavior and improve sales

Social media is the biggest source of Big Data. Because of this, 90% of Fortune 500 companies are investing in Big Data initiatives that will help them predict consumer behavior to produce better sales results. Written by Dr. Gabor Szabo, a Senior Data Scientist at Twitter, and Dr. Oscar Boykin, a Software Engineer at Twitter, Social Media Data Mining and Analytics shows analysts how to use sophisticated techniques to mine social media data, obtaining the information they need to generate amazing results for their businesses.

Social Media Data Mining and Analytics isn't just another book on the business case for social media. Rather, it provides hands-on examples of applying state-of-the-art tools and technologies to mine social media. Examples include Twitter, Facebook, Pinterest, Wikipedia, Reddit, Flickr, Web hyperlinks, and other rich data sources. In it, you will learn:

  • The four key characteristics of online services: users, social networks, actions, and content
  • The full data discovery lifecycle: data extraction, storage, analysis, and visualization
  • How to work with code and extract data to create solutions
  • How to use Big Data to make accurate customer predictions

Szabo and Boykin wrote this book to provide businesses with the competitive advantage they need to harness the rich data that is available from social media platforms.

About the authors

GABOR SZABO, PHD, is a Senior Staff Software Engineer at Tesla and a former data scientist at Twitter, where he focused on predicting user behavior and content popularity in crowdsourced online services, and on modeling large-scale content dynamics. He also authored the PyCascading data processing library.

GUNGOR POLATKAN, PHD, is a Tech Lead/Engineering Manager designing and implementing end-to-end machine learning and artificial intelligence offline/online pipelines for the LinkedIn Learning relevance backend. He was previously a machine learning scientist at Twitter, where he worked on topics such as ad targeting and user modeling.

P. OSCAR BOYKIN, PHD, is a software engineer at Stripe where he works on machine learning infrastructure. He was previously a Senior Staff Engineer at Twitter, where he worked on data infrastructure problems. He is coauthor of the Scala big-data libraries Algebird, Scalding and Summingbird.

ANTONIOS CHALKIOPOULOS, MSC, is a Distributed Systems Specialist. A system engineer who has delivered fast/big data projects in media, betting, and finance, he is now leading the effort on the Lenses platform for data streaming as a co-founder and CEO at https://lenses.stream.

Contents

Introduction

Chapter 1 Users: The Who of Social Media
Measuring Variations in User Behavior in Wikipedia
The Diversity of User Activities
The Origin of the User Activity Distribution
The Consequences of the Power Law
The Long Tail in Human Activities
Long Tails Everywhere: The 80/20 Rule (p/q Rule)
Online Behavior on Twitter
Retrieving Tweets for Users
Logarithmic Binning
User Activities on Twitter
Summary

Chapter 2 Networks: The How of Social Media
Types and Properties of Social Networks
When Users Create the Connections: Explicit Networks
Directed Versus Undirected Graphs
Node and Edge Properties
Weighted Graphs
Creating Graphs from Activities: Implicit Networks
Visualizing Networks
Degrees: The Winner Takes All
Counting the Number of Connections
The Long Tail in User Connections
Beyond the Idealized Network Model
Capturing Correlations: Triangles, Clustering, and Assortativity
Local Triangles and Clustering
Assortativity
Summary

Chapter 3 Temporal Processes: The When of Social Media
What Traditional Models Tell You About Events in Time
When Events Happen Uniformly in Time
Inter-Event Times
Comparing to a Memoryless Process
Autocorrelations
Deviations from Memorylessness
Periodicities in Time in User Activities
Bursty Activities of Individuals
Correlations and Bursts
Reservoir Sampling
Forecasting Metrics in Time
Finding Trends
Finding Seasonality
Forecasting Time Series with ARIMA
The Autoregressive Part (AR)
The Moving Average Part (MA)
The Full ARIMA(p, d, q) Model
Summary

Chapter 4 Content: The What of Social Media
Defining Content: Focus on Text and Unstructured Data
Creating Features from Text: The Basics of Natural Language Processing
The Basic Statistics of Term Occurrences in Text
Using Content Features to Identify Topics
The Popularity of Topics
How Diverse Are Individual Users' Interests?
Extracting Low-Dimensional Information from High-Dimensional Text
Topic Modeling
Unsupervised Topic Modeling
Supervised Topic Modeling
Relational Topic Modeling
Summary

Chapter 5 Processing Large Datasets
MapReduce: Structuring Parallel and Sequential Operations
Counting Words
Skew: The Curse of the Last Reducer
Multi-Stage MapReduce Flows
Fan-Out
Merging Data Streams
Joining Two Data Sources
Joining Against Small Datasets
Models of Large-Scale MapReduce
Patterns in MapReduce Programming
Static MapReduce Jobs
Iterative MapReduce Jobs
PageRank for Ranking in Graphs
K-means Clustering
Incremental MapReduce Jobs
Temporal MapReduce Jobs
Rollups and Data Cubing
Expanding Rollup Jobs
Challenges with Processing Long-Tailed Social Media Data
Sampling and Approximations: Getting Results with Less Computation
HyperLogLog
HyperLogLog Example
HyperLogLog on the Stack Exchange Dataset
Performance of HLL on Large Datasets
Bloom Filters
A Bloom Filter Example
Bloom Filter as Pre-Computed Members...

Product information

Title: Social Media Data Mining and Analytics
Authors: Gabor Szabo, Gungor Polatkan, P. Oscar Boykin, Antonios Chalkiopoulos
EAN: 9781118824900
Digital copy protection: Adobe DRM
Format: E-book (PDF)
Publisher: Wiley
Genre: Computer science
Number of pages: 352
Publication date: 18 September 2018
File size: 30.4 MB