Data Management in R with tidyr, dplyr and ggvis – Part 0

Hey guys,

In the upcoming weeks I would like to talk about data management in R. I will show you how to tidy, transform and visualise your data using three popular and powerful data management packages: tidyr, dplyr and ggvis. This is what you can expect in the next weeks:

In general, data management can be divided into three main steps:

1. Tidy: Data are often messy, and before they can be analysed they have to be tidied up. The basic rule is to have each variable saved in its own column and every observation saved in its own row. For this purpose I will show you the package tidyr, developed by the RStudio team, which makes cleaning and tidying data much easier.

2. Transform: Usually, before you can jump into your data analysis, you first have to transform, subset and filter your data. dplyr is a package that specialises in data frames and is supposed to be much faster than plyr and other comparable packages. I will show you how to use this package by explaining the most frequently used functions like select, filter, mutate,…

3. Visualise: Once your data are tidy and transformed you can start your models and analysis. The generated data and information usually have to be visualised. For this purpose I will show you the package ggvis. Like ggplot2, it is built on concepts from the grammar of graphics. In addition, it adds interactivity and a new data pipeline, and it renders in a web browser, which makes it easy to share and publish your results.
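To give a first taste of the three steps, here is a minimal sketch. The data frame `messy` and its columns (`site`, `area`, the year columns) are made up for illustration and are not taken from the upcoming posts. The ggvis part is only run if the package is installed, and the plot is built but not displayed.

```r
library(tidyr)
library(dplyr)

# Hypothetical messy data: one column per year instead of one row per observation
messy <- data.frame(
  site = c("A", "B"),
  `2013` = c(10, 20),
  `2014` = c(15, 25),
  check.names = FALSE
)

# 1. Tidy: gather the year columns so that each row is one site-year observation
tidy <- gather(messy, key = "year", value = "area", -site)

# 2. Transform: convert the year to a number, keep only 2014, keep two columns
result <- tidy %>%
  mutate(year = as.integer(year)) %>%
  filter(year == 2014) %>%
  select(site, area)

# 3. Visualise: build a scatter plot of area per year (only if ggvis is installed)
if (requireNamespace("ggvis", quietly = TRUE)) {
  plot <- ggvis::layer_points(ggvis::ggvis(tidy, ~year, ~area))
  # Printing `plot` in an interactive session would render it in the viewer or browser
}
```

Here `tidy` has one row for each combination of site and year, and `result` keeps only the 2014 rows. Each of these functions will get a proper introduction in the following parts.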

Hope to see you in the coming weeks!

Cheers

Martin

 

About the author

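Martin was born in the Czech Republic and studied at the University of Natural Resources and Life Sciences, Vienna. He currently works at GeoVille, an Earth observation company based in Austria that specialises in land monitoring. His main interests are open-source applications such as R, (geospatial) statistics and data management, web mapping and visualisation. He enjoys travelling, treasure hunting, photography and sports.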
