Cloud Computing: Pig and Pig Latin
- Slides: 20
Cloud Computing: Data Processing Platform - Pig
Basic architecture of Pig
◦ Layers shown in the diagram: Pig Latin, MapReduce, Cluster
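Pig Latin programming language: data types
  Type       Description                        Example
  int        Signed 32-bit integer              127
  long       Signed 64-bit integer              127L
  float      32-bit floating point              3.14F
  double     64-bit floating point              3.14
  chararray  Character array in UTF-8 format    Hello World
  bytearray  Byte array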
Pig Latin programming language: operators
◦ Bincond operator example
  grunt> A = LOAD 'data.txt' AS (f1:int, f2:int, B:bag{T:tuple(t1:int, t2:int)});
  grunt> DUMP A;
  (3,2,{(1,7),(3,5)})
  (3,3,{(1,7),(3,5)})
  (3,5,{(1,7),(3,5),(4,6)})
  Then run the following operation:
  grunt> X = FOREACH A GENERATE f2, (f2 == 2 ? 1 : COUNT(B));
  grunt> DUMP X;
  (2,1)
  (3,2L)
  (5,3L)
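The bincond example above can be mimicked outside Pig. The following is a minimal Python sketch, assuming a bag is modeled as a list of tuples and that `COUNT(B)` corresponds to the number of tuples in the bag; the function name `bincond_row` is hypothetical and not part of Pig.

```python
# Rows of relation A as (f1, f2, B); the bag B is modeled as a list of tuples.
A = [
    (3, 2, [(1, 7), (3, 5)]),
    (3, 3, [(1, 7), (3, 5)]),
    (3, 5, [(1, 7), (3, 5), (4, 6)]),
]


def bincond_row(row):
    """Mimic GENERATE f2, (f2 == 2 ? 1 : COUNT(B)) for one row."""
    _f1, f2, bag = row
    # Python's conditional expression plays the role of Pig's "cond ? a : b".
    return (f2, 1 if f2 == 2 else len(bag))


X = [bincond_row(row) for row in A]
print(X)
```

Only the first row satisfies `f2 == 2`, so it yields the constant 1; the other rows fall through to the bag size.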
Pig Latin programming language: worked example
◦ Task: find the 10 most visited pages in each category
  Visits
  User  Url         Time
  Amy   cnn.com     8:00
  Amy   bbc.com     10:00
  Amy   flickr.com  10:05
  Fred  cnn.com     12:00

  Url Info
  Url         Category  PageRank
  cnn.com     News      0.9
  bbc.com     News      0.8
  flickr.com  Photos    0.7
  espn.com    Sports    0.9
Pig Latin programming language: dataflow
◦ Load Visits -> Group by url -> Foreach url generate count
◦ Load Url Info
◦ Join on url (visit counts with Url Info) -> Group by category -> Foreach category generate top 10 urls
Pig Latin programming language: Pig Latin implementation
  -- Load the visit log and count the visits per url.
  visits = load '/data/visits' as (user, url, time);
  gVisits = group visits by url;
  visitCounts = foreach gVisits generate url, count(visits);
  -- Attach the category of each url to its visit count.
  urlInfo = load '/data/urlInfo' as (url, category, pRank);
  visitCounts = join visitCounts by url, urlInfo by url;
  -- Keep the 10 most visited urls in each category and store them.
  gCategories = group visitCounts by category;
  topUrls = foreach gCategories generate top(visitCounts, 10);
  store topUrls into '/data/topUrls';
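To make the dataflow concrete, here is a small in-memory Python sketch of the same steps (count per url, join on url, group by category, top N), run on the sample rows from the Visits and Url Info tables. It only illustrates the logic, not how Pig executes it; the function name `top_urls_by_category` and the inner-join behavior are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Sample rows from the Visits table: (user, url, time).
visits = [
    ("Amy", "cnn.com", "8:00"),
    ("Amy", "bbc.com", "10:00"),
    ("Amy", "flickr.com", "10:05"),
    ("Fred", "cnn.com", "12:00"),
]

# Sample rows from the Url Info table: (url, category, page_rank).
url_info = [
    ("cnn.com", "News", 0.9),
    ("bbc.com", "News", 0.8),
    ("flickr.com", "Photos", 0.7),
    ("espn.com", "Sports", 0.9),
]


def top_urls_by_category(visits, url_info, n=10):
    """Return {category: [(url, visit_count), ...]} with at most n urls each."""
    # group visits by url + foreach ... generate count
    visit_counts = Counter(url for _user, url, _time in visits)

    # join on url (assumed inner join: urls without visits are dropped)
    joined = [
        (url, category, visit_counts[url])
        for url, category, _rank in url_info
        if url in visit_counts
    ]

    # group by category
    by_category = defaultdict(list)
    for url, category, count in joined:
        by_category[category].append((url, count))

    # foreach category generate the n most visited urls
    return {
        category: sorted(rows, key=lambda row: row[1], reverse=True)[:n]
        for category, rows in by_category.items()
    }


top_urls = top_urls_by_category(visits, url_info)
print(top_urls)
```

With the sample data, espn.com has no visits and therefore does not appear in the result, since the sketch treats the join as an inner join.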
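Pig Latin programming language: MapReduce jobs
◦ Every group or join operation forms a map-reduce boundary.
◦ Stages shown in the diagram: Map 1, Reduce 1, Map 2, Reduce 2, Map 3, Reduce 3
◦ Operations in pipeline order: Load Visits; Group by url; Foreach url generate count; Load Url Info; Join on url; Group by category; Foreach category generate top10(urls)
◦ The three boundaries (Group by url, Join on url, Group by category) account for the three map-reduce jobs.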
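The boundary rule can be checked with a few lines of Python. This is a deliberately simplified sketch: it only applies the slide's rule that each group or join introduces one map-reduce boundary, and it does not model how Pig's compiler actually plans or optimizes jobs. The names `PLAN` and `shuffle_boundaries` are hypothetical.

```python
# Operators of the example pipeline, in the order they appear in the script.
PLAN = [
    "load visits",
    "group by url",
    "foreach url generate count",
    "load url info",
    "join on url",
    "group by category",
    "foreach category generate top10(urls)",
]

# Operators that, per the slide's rule, force a map-reduce boundary.
BOUNDARY_OPERATORS = {"group", "join"}


def shuffle_boundaries(plan):
    """Return the operators that start with a boundary keyword."""
    return [op for op in plan if op.split()[0] in BOUNDARY_OPERATORS]


boundaries = shuffle_boundaries(PLAN)
print(boundaries)
print("map-reduce jobs:", len(boundaries))
```

For this pipeline the rule gives three boundaries, matching the three Map/Reduce pairs listed above.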