Fast Powerful Efficient Joining Without Joining to Explore
- Slides: 16
Fast, Powerful, Efficient: Joining Without Joining to Explore Summer Games Data With JMP® Mandy Chambers, Principal Development Tester & Chung-Wei Ng, Principal Systems Developer JMP, SAS® Abstract Summer Games Facts Process The Summer Games occur every four years. I participated in a number of sports growing up and loved watching these games. I thought it would be interesting to use JMP to look at some historical data. What country wins the most medals? What is the average of Equestrian participants? Michael Phelps became the youngest US swimmer in 68 years when he debuted in 2000 at the age of 15, finishing 5 th in the 200 -meter butterfly, with no medals won. But today, at 31 he holds the record for most medals won with 28, (23 of those being gold) and is the most decorated Olympian of all time. • Imported Excel files from The Olympic Studies Centre, JMP has a new feature that we know will score big points with our users. Joining data from one table to another, without having to physically join the tables. Introducing New team member Virtual Join! • • • the International Olympic Committee (2008, 2012 medal winners; recently added 2016 medal winners) From main JMP menu, selected File Internet Open to get data from www. sports-reference. com (2012 participant data, History of Summer Games, Medals all time - retrieved July 2016) Shaped data by formatting columns to ensure data types would match for linking together. Used Recode to have better matches on country and name Defined formula columns to calculate BMI for athletes, and formula columns to duplicate a column for Link ID purposes Decided which columns of data had unique ID identifiers to determine the best way to pull data together Launched platforms for data analysis and visualization to show to join tables together using this new feature for JMP 13 • • The five rings represent the union of the five continents and the meeting of athletes from all over the world at the Summer Games. In Brazil, the flame was transported 20, 000 km by road, 10, 000 km by air and carried by 12, 000 torchbearers. 2016 was the first time in the summer games history that a South American country had hosted the games. Rugby sevens made its debut in 2016, and golf returned after a 112 -year absence. (Fiji won first gold medal in Rugby) The last gold medals made entirely out of gold were in 1912. Now, gold and silver medals are 92. 5 percent silver with the gold medal covered in six grams of gold. In the opening ceremonies, the procession of the athletes is always led by the Greek team, followed by all the other teams in alphabetical order (in the language of the hosting country). The team of the hosting country always enters last. The games in 2016 were the first to feature participants born in the year 2000! Objectives • • Ability to explore and join data from multiple tables Assign Link IDs and Link References to relate columns in one table to another Avoid creating large and unwieldy tables with many columns and rows Explore data without performing a physical join thus saving on memory storage Click here for more about Virtual Join! Data Tables Click any picture to zoom in
Fast, Powerful, Efficient: Joining Without Joining to Explore Summer Games Data With JMP ® Mandy Chambers, Principal Development Tester & Chung-Wei Ng, Principal Systems Developer JMP, SAS® Steps to Set Link Id/ Link Reference Relationship between Virtual Joined tables Follow steps to ensure tables are joined properly • Different ways to create Link ID Simple example of Virtual Join Mapping of tables - Right click column menu - Column properties, select table - Scripting • Different ways to set up Link Reference columns - same ways as above for Link ID • • Click any picture to zoom in • Medal. Winner 2012 person. Info is a referencing table and a lookup table 2008 Medals. All, 2012 Medals. All are referencing tables, using look-through to retrieve data from Olym. Countries. ALL & Summer. Medals. All. Time, which are lookup tables Historyof. Summer. Games is a lookup table for Summer. Medals. Alll. Time When columns are linked correctly the Link Reference key is blue If the link is not set to the correct table or the table is not open, the key will be gray Examples using Virtual Join Reference columns are marked by the icon shown below
Fast, Powerful, Efficient: Joining Without Joining to Explore Summer Games Data With JMP ® Mandy Chambers, Principal Development Tester & Chung-Wei Ng, Principal Systems Developer JMP, SAS® Virtual Join – Data Results Linked Compare Distributions for Different Years Over Time Medal Winners – Totals for 2012 • Treemap with Sport by Country & Gender, Total medals • Use local data filter to drill into Swimming data • Notice label for United States of America shows the flag Observations Virtual join allows you to link tables without a physical join Has the potential to save on memory and storage of data Many platforms recognize linked data columns A great feature for operational data because it provides the ability to retrieve data from multiple lookup tables • • Click any picture to zoom in References & Acknowledgements www. sports-reference. org Olympics Studies Centre from International Olympic Committee 3 Carrie Hughston, diving photo 4 Chung-Wei Ng, Developer of Virtual Join in JMP 5 www. nih. gov 6 Poster created July 2016 1 2
Can you locate Rio de Janeiro? Can you guess the total medal count for Brazil? Hint: Look near flag label Return to Poster
Virtual Join Facts Link ID is a column property that identifies a unique key for the data table. This column is called the Link ID column, analogous to a primary key in a database table. Only one column in a table can be designated with the Link ID property. (Gold key appears in columns panel) Link Reference is a column property that identifies the table to be linked or virtually joined to the referencing column. This column is called the Link Reference column, analogous to a foreign key in a database table. A table may have multiple Link Reference columns. (Blue key appears in columns panel) NOTE: The values of the Link Reference column are mapped to the values of the Link ID column of the lookup table. Virtual Join allows you to explore data using a referencing column in a table to link to other lookup data tables and get more information without using more memory. The join is made using the Link Reference column in the referencing table and the Link ID column in the lookup table. Return to Poster
Data Table Examples • Table above shows Link Reference key is blue • Table to the right shows Link ID is yellow • Reference columns are grouped & marked by the icon you see below • These columns are hidden and locked Return to Poster
Different ways to set up Link ID columns //snippet of jsl New Column("Country 1 A", Character, "Nominal", Set Property("Link ID", 1), Return to Poster
Different ways to set up Link Reference columns Select Table allows you to search for your table on your computer more easily //snippet of jsl Set Property("Link Reference", Reference Table( “Olym. Countries. ALL. jmp" )), Return to Poster
Relationship among Virtual Join tables (Simple Example) Medalists Rio 2016 has access to all the columns of data from Olympic. Countries. ALL_2016 Medalists Rio 2016 is a referencing table. Olympic. Countries. ALL_2016 is a lookup table. Return to Poster
Virtual Joined Tables - Analysis and Visualization Fiji flag List of Countries winning their first gold medals. Kosovo won Women’s Judo, and Singapore took their first gold in 100 meter Butterfly, defeating Michael Phelps who won the silver. Labels are coming from linked data table (Olympic. Countries. ALL_2016), from the column Country_Flag (NOTE: Saves memory in your referencing data table by eliminating bringing in the actual column) Return to Poster Kosovo flag
Relationship among Virtual Join tables 2008 Medals. All & 2012 Medals. All have access to all the columns from the other tables because of “pass through” or look-through that is taking place due to the way the tables are linked together. Medal. Winner 2012 person. Info table in the center is both a referencing table and a lookup table. Historyof. Summer. Games is a lookup table for Summer. Medals. Alll. Time. JMP employee Mandy Chambers swimming for leisure Return to Poster
Compare Distributions – Dashboard NOTE: Athletics is more commonly known as Track and Field Return to Poster
Treemap Example of Virtual Join Data - 1 Return to Poster
Treemap Example of Virtual Join Data - 2 • • • Select United States of America for both Female and Male (Red) Filters data to show a subset even deeper – creates a table with US swimmers selected From referencing table, Tables -> Subset makes a subset table showing only US medal winners Return to Poster
Heatmap Examples According to the NIH website: Underweight = <BMI of 18. 5 Normal weight = BMI of 18. 5– 24. 9 Overweight = BMI of 25– 29. 9 Obesity = BMI of 30 or greater Equestrian for both Men and Women show average of around 40. Female Cycling-Road also shows red in this heatmap – Gold medal winner Kristen Armstrong from USA is 41 and still winning today!
Summer Games SAS employee Carrie Knoeber Hughston diving at University of Kentucky in 1999 Nice summary Tabulate shows host Countries for Summer Games through 2012, total medals all time per country, & participant totals Return to Poster
- Productively efficient vs allocatively efficient
- Productively efficient vs allocatively efficient
- Productively efficient vs allocatively efficient
- Productively efficient vs allocatively efficient
- Productively efficient vs allocatively efficient
- Differentiate between acid fast and non acid fast bacteria
- Example of acid-fast bacteria
- Hold fast to your dreams for without them life is a broken
- Poem without title
- Diane glancy without title
- Poetic devices in keeping quiet
- What two countries did henry hudson explore for
- What motivates people to explore the unknown
- Clcc imperial
- College physics explore and apply 2nd edition answers
- Explore evolve validate prototype
- Lesson 1 europeans explore overseas