
Lecture Summary: Computational Analysis of Digital Communication

Pages: 71
Uploaded: 08-12-2022
Academic year: 2022/2023

A summary of lectures 1 through 4 of Computational Analysis of Digital Communication at the VU.


Lecture 1 - Introduction to Computational Methods - 31/10/2022
Computational Social Science: A field of social science that uses algorithmic tools and
large/unstructured data to understand human and social behavior. Computational methods
serve as a “microscope”: the methods are not the goal in themselves, but contribute to
theoretical development and/or data generation. They complement rather than replace
traditional methodologies and include methods such as:
★ Advanced data wrangling/data science
★ Combining different data sets
★ Automated Text Analysis
★ Machine Learning (supervised and unsupervised)
★ Actor-based modeling
★ Simulations

Typical Workflow




Why is this important now?
★ In the past, collecting data was expensive (surveys, observations…).
★ In the digital age, the behaviors of billions of people are recorded, stored, and
therefore analyzable.
★ Every time you click on a website, make a call on your mobile phone, or pay for
something with your credit card, a digital record of your behavior is created and
stored.
★ Because (meta-)data are a byproduct of people’s everyday actions, they are often
called digital traces.
★ Large-scale records of persons or businesses are often called big data.

10 characteristics of big data
★ Big: The scale or volume of some current datasets is often impressive. However, big
datasets are not an end in themselves, but they can enable certain kinds of research,
including the study of rare events, the estimation of heterogeneity, and the detection
of small differences.
★ Always-on: Many big data systems collect data constantly and thus enable the study of
unexpected events and allow for real-time measurement.
★ Nonreactive: Participants are generally not aware that their data are being captured,
or they have become so accustomed to this data collection that it no longer changes
their behavior.
★ Incomplete: Most big data sources are incomplete, in the sense that they don’t have
the information that you will want for your research. This is a common feature of data
that were created for purposes other than research.
★ Inaccessible: Data held by companies and governments are difficult for researchers to
access.
★ Nonrepresentative: Despite their size, most big datasets are not representative of
certain populations. Out-of-sample generalizations are hence difficult or impossible.
★ Drifting: Many big data systems are changing constantly, thus making it difficult to
study long-term trends.
★ Algorithmically confounded: Behavior in big data systems is not natural; it is driven
by the engineering goals of the systems.
★ Dirty: Big data often include a lot of noise (e.g., junk, spam, spurious data points…).
★ Sensitive: Some of the information that companies and governments have is sensitive.


Example: Smartphone log data:
★ Big: Thousands of rows per person, but not many columns.
★ Always-on: Recorded smartphone use at all times.
★ Incomplete: App use was not recorded for apps with higher privacy standards.
★ Dirty: Depending on what you want to study, lots of noise.
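
The “big but narrow” and “dirty” properties of such log data can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration, not taken from the lecture: the column names, the sample rows, and the two-second threshold for treating a session as an accidental tap. What counts as noise depends on the research question.

```python
# Hypothetical long-format log: one row per app session, only a few columns.
sessions = [
    {"person": "p1", "app": "messenger", "start": 0, "duration_s": 40},
    {"person": "p1", "app": "messenger", "start": 0, "duration_s": 40},  # exact duplicate
    {"person": "p1", "app": "browser", "start": 100, "duration_s": 1},   # accidental tap
    {"person": "p1", "app": None, "start": 200, "duration_s": 300},      # app not recorded
    {"person": "p2", "app": "news", "start": 50, "duration_s": 120},
]

MIN_DURATION_S = 2  # assumed threshold below which a session is treated as noise


def clean(rows, min_duration_s=MIN_DURATION_S):
    """Drop exact duplicate rows and sessions shorter than the threshold."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["person"], row["app"], row["start"], row["duration_s"])
        if key in seen:
            continue  # dirty: the same event was logged twice
        seen.add(key)
        if row["duration_s"] < min_duration_s:
            continue  # dirty: too short to be meaningful use
        kept.append(row)
    return kept


cleaned = clean(sessions)
# Incomplete: rows whose app name is missing cannot be repaired, only counted.
missing_app = sum(1 for row in cleaned if row["app"] is None)
print(len(sessions), len(cleaned), missing_app)
```

Cleaning can remove noise, but it cannot recover information that was never recorded, which is why incompleteness is reported here rather than fixed.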

Typical computational research strategies
1. Counting things: In the age of big data, researchers can “count” more than ever
- How often do people use their smartphone per day?
- About which topics do news websites write most often?
2. Forecasting and nowcasting: Big data allow for more accurate predictions of both
the present (nowcasting) and the future (forecasting)
- Investigate when people disclose themselves in computer-mediated
communication
- Crime prediction
3. Approximating experiments: Computational methods provide opportunities to
conduct “natural experiments”
- Compare smartphone log data of people who use their smartphone naturally
vs. those who abstain from certain apps (e.g., social media apps)
- Investigate the potential of nudges to make users select certain news
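
The first strategy, counting things, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the log rows, the person identifiers, and the idea of using screen unlocks as a measure of smartphone use are all hypothetical and not taken from the lecture.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log rows: (person_id, ISO timestamp of a screen unlock).
log_rows = [
    ("p1", "2022-10-31T08:02:11"),
    ("p1", "2022-10-31T12:45:00"),
    ("p1", "2022-11-01T09:15:30"),
    ("p2", "2022-10-31T07:30:00"),
    ("p2", "2022-10-31T07:55:42"),
    ("p2", "2022-10-31T22:10:05"),
]


def unlocks_per_day(rows):
    """Count log events per (person, calendar day)."""
    counts = Counter()
    for person_id, timestamp in rows:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[(person_id, day)] += 1
    return counts


def mean_daily_unlocks(counts):
    """Average the daily counts per person over the days on which they appear."""
    per_person = defaultdict(list)
    for (person_id, _day), n in counts.items():
        per_person[person_id].append(n)
    return {p: sum(v) / len(v) for p, v in per_person.items()}


daily = unlocks_per_day(log_rows)
averages = mean_daily_unlocks(daily)
print(averages)
```

Note that days without any logged event do not appear in the counts at all, so this average only covers days with at least one unlock. Whether a missing day means “no use” or “no recording” is exactly the incompleteness problem of big data.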

Advantages and disadvantages
★ Advantages of Computational Methods: Actual behavior rather than self-reports, social
context rather than a lab setting, large N rather than small N.
★ Disadvantages of Computational Methods: Techniques are often complicated, data are
often proprietary, samples are often biased, and metadata are often insufficient.

Computational Communication Science (CCS): the label applied to the emerging subfield
that investigates the use of computational algorithms to gather and analyze big and often
semi- or unstructured data sets to develop and test communication science theories.

Promises
The recent acceleration in the use of computational methods for communication science is
primarily fueled by the confluence of at least three developments:
★ vast amounts of digitally available data, ranging from social media messages and
other digital traces to web archives and newly digitized newspaper and other
historical archives.
★ improved tools to analyze this data, including network analysis methods and
automatic text analysis methods such as supervised text classification, topic
modeling, word embeddings, and syntactic methods.
★ powerful and cheap processing power, and easy-to-use computing infrastructure for
processing these data, including scientific and commercial cloud computing, sharing
platforms such as GitHub and Dataverse, and crowd coding platforms such as
Amazon MTurk and Crowdflower.

Ethical problems with computational methods
★ More power over participants than in the past
- Data collection without awareness/consent
- Manipulation without awareness/consent
- Data potentially sensitive, individual users identifiable
★ Guiding principles:
- Respect for persons: Treating people as autonomous and honoring their
wishes.
- Beneficence: Understanding and improving the risk/benefit profile of a study.
- Justice: Risks and benefits should be evenly distributed.
- Respect for law and public interest

Challenges of computational communication science
★ Simply data-driven research questions might not be theoretically interesting
★ Proprietary data threatens accessibility and reproducibility
★ ‘Found’ data not always representative, threatening external validity
★ Computational method bias and noise threaten accuracy and internal validity
★ Inadequate ethical standards/procedures

Preliminary summary
★ Computational communication research holds manifold promises.
★ We can harness unusual sources of information and large amounts of data,
particularly because people constantly leave digital traces.
★ New methods make it possible to structure, aggregate, and make sense of these data and
to extract meaningful information to study communication behavior and phenomena.
★ However, computational communication research comes with ethical challenges
related to consent, privacy, and autonomy of the participants.

Exam: 40% - 35 MC, 6 open-ended
Homework: 30%
Group presentation: 30% - 10 minute talk

Example exam question (MC)
Why is the “Facebook Manipulation Study” by Kramer et al. ethically problematic?
A. People didn't know that they took part in a study (no informed consent)
B. It overly manipulated people’s emotions
C. Both A and B are true
D. The study was not ethically problematic

Example exam question (Open format)
Name and explain two characteristics of big data.
1. Big data is often “incomplete”: This means it does not contain the information that
you will want for your research. This is a common feature of data that was created for
purposes other than research. For example, log data (e.g., browser history) includes
all links a person has visited over time, but does not provide any additional
information. Moreover, it may contain gaps where the software failed or the person
purposefully hid their browsing behavior.
2. Big data is often “algorithmically confounded”: Behavior in big data systems is not
natural; it is driven by the engineering goals of the systems. For example, what you
see on a Facebook news feed depends on an algorithm that Facebook has built into
its platform. Behavior of individuals is thus also driven by these system-immanent
features.

Written by: Sterrevermond (Universiteit van Amsterdam)