Why?

July 1, 2015

useR 2015: Networks

Filed under: R, useR 2015 — csgillespie @ 9:39 am

These are my initial notes from useR 2015. Will revise when I have time.

fbRads: Analyzing and managing Facebook ads from R (Gergely Daroczi)

Modern advertising

Google/Amazon/Facebook use our information

Ad platforms: Google: RAdwords, facebook likes: fbRads. You can use the facebook API to get information from facebook. Get hashes of email address, not the actual address. In the last few years, the API has changed.

Example

  1. Grab useR's email addresses from CRAN and R-help mailing list.
  2. Create a facebook app with API to get a token.
  3. Create a custom audience.
  4. Create lookalike audiences: get facebook users who are similar to my target list.
  5. Define audience, ad and budget.
  6. Upload an image and description.
  7. Run A/B testing.

The performance metrics API is still being developed.

Web scraping with R – A fast track overview. (Peter Meißner)

There are a number of R packages for web-scraping.

Two problems:

  • Download: protocols/procedures, i.e. HTTP, cookies, POST, GET
  • extraction: parsing/extraction/cleansing, i.e. XML, JSON, html into R

Reading text from the Web

The simplest solution is to use readLines, then use some regular expressions (either with base R or stringr or …).

Reading HTML/XML

Use rvest and use xml_structure to view the structure of the XML scheme. To extract text, we need to use XPath (still using rvest). Within rvest there are a number of convenience functions, e.g. html_table to get a list of tables.

JSON

Use jsonlite to translate JSON to a data frame.

HTML forms/HTTP methods

Use httr and rvest packages.

Overcoming the Javascript Barrier

Use RSelenium for browser automation

Conclusion

  • Don't use Windows for web scraping. Use Linux (or if you must, a Mac)
  • Start with stringr, rvest, jsonlite
  • Need to learn regular expression, file manipulation
  • Before scraping, look for the download button

multiplex: Analysis of Multiple Social Networks with Algebra (Antonio Rivero Ostoic)

Motivation

  • multiplex is a package designed to perform algebraic analyses of multiple networks (but isn't limited to algebra)
  • The function zbind creates multivariate network data from arrays
  • perm manipulates network data

Two-mode networks are represented in a Galois framework. This makes analysis easier(?)

What's new in igraph and networks (Gabor Csardi)

Abstract

igraph is the premier R package for the analysis of network data and it went through major restructuring recently and has changed a lot since last time it was featured at useR! in 2008. This talk introduces the new/updated features of igraph: – Simplified ways of graph manipulation. – New methods community detection. – New layouts for graph visualization. – New statistical methods: graphlets, embeddings, graph matching, cohesive blocks, etc. – How to use igraph graphs with new visualization tools: DiagrammeR, D3, etc.

The igraph package deals mainly with infrastructure. It's actually a C library, with an R and python interface.

What's new: [ and [[

The [ operator makes the graph behave like an adjacency matrix. For example, to check if an edge exists, use air["BOS", "SFO"]. Can also use it to manipulate the network, e.g. to add or remove edges.

The [[ can be used to get all adjacent vertices

What's new: consistent function names and manipulators

  • make_*, sample_*, cluster_, layout_*, graph_from_*
  • manipulators: make_ and sample_
  • Pipe friendly syntax
  • Easier connection to other packages, e.g. networkD3

Current work

  • Better connection to other packages
  • Inference
  • Infrastructure cleanup

Please note that the notes/talks section of this post is merely my notes on the
presentation. I may have made mistakes: these notes are not guaranteed to be
correct. Unless explicitly stated, they represent neither my opinions nor the
opinions of my employers. Any errors you can assume to be mine and not the
speaker’s. I’m happy to correct any errors you may spot – just let me know!

Advertisements

1 Comment »

  1. Regarding your post on `multiplex´: “The concept lattice of the context” that is product of Galois derivations makes easier to analyse two-mode networks.

    Comment by Antonio Rivero Ostoic — July 7, 2015 @ 8:13 am


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: