Lecturer: Emi Tanaka
Department of Econometrics and Business Statistics
ETC5512.Clayton-x@monash.edu
Week 4
Recall from lecture 2:
Collecting data on the entire population is normally too expensive or infeasible! (If we can, it is called a census.)
We therefore collect data only on a subset of the population.
How often is the census conducted in Australia?
Why do we run the census?
ABS is the independent statistical agency of the Government of Australia.
If you are from outside Australia, find the government statistical agency in your country.
The first Australian census was held in 1911.
Since 1961, the census has been held every 5 years in Australia.
The 2016 census was conducted at a cost of $440 million.
The next census will be held in 2026!
The ABS is legislated to collect and disseminate census data under the ABS Act 1975 and the Census and Statistics Act 1905.
Similar legislation is in place in many countries.
There are two main types of data that you can download:
Today,
First, pray hard that there is some description!
Without some description or understanding of the variables, it will be near impossible to extract meaningful information from the data.
"About DataPacks_readme.md - "Read Me" documentation containing helpful information for users about the data and how it is structured (.md)"
We could also try going through the meta-data.
Metadata_2016_GCP_DataPack.xlsx
Table number | Table name | Table population |
---|---|---|
G17 | Total Personal Income (Weekly) by Age by Sex | Persons aged 15 years and over |
G28 | Total Family Income (Weekly) by Family Composition | Families in family households |
G29 | Total Household Income (Weekly) by Household Composition | Occupied private dwellings |
... | ... | ... |
Let's open 2016Census_geog_desc_1st_2nd_3rd_release.xlsx
... and there are the region names of each geographical code.
Let's go with the easy one: STE (state/territory), for Victoria.
Why is the table organised like this?
2016Census_G17A_VIC_STE.csv
STE_CODE_2016 | M_Neg_Nil_income_15_19_yrs | M_Neg_Nil_income_20_24_yrs | M_Neg_Nil_income_25_34_yrs | M_Neg_Nil_income_35_44_yrs | M_Neg_Nil_income_45_54_yrs | M_Neg_Nil_income_55_64_yrs | M_Neg_Nil_income_65_74_yrs | M_Neg_Nil_income_75_84_yrs | M_Negtve_Nil_incme_85_yrs_ovr | M_Neg_Nil_income_Tot | M_1_149_15_19_yrs | M_1_149_20_24_yrs | M_1_149_25_34_yrs | M_1_149_35_44_yrs | M_1_149_45_54_yrs | M_1_149_55_64_yrs | M_1_149_65_74_yrs | M_1_149_75_84_yrs | M_1_149_85ov | M_1_149_Tot | M_150_299_15_19_yrs | M_150_299_20_24_yrs | M_150_299_25_34_yrs | M_150_299_35_44_yrs | M_150_299_45_54_yrs | M_150_299_55_64_yrs | M_150_299_65_74_yrs | M_150_299_75_84_yrs | M_150_299_85ov | M_150_299_Tot | M_300_399_15_19_yrs | M_300_399_20_24_yrs | M_300_399_25_34_yrs | M_300_399_35_44_yrs | M_300_399_45_54_yrs | M_300_399_55_64_yrs | M_300_399_65_74_yrs | M_300_399_75_84_yrs | M_300_399_85ov | M_300_399_Tot | M_400_499_15_19_yrs | M_400_499_20_24_yrs | M_400_499_25_34_yrs | M_400_499_35_44_yrs | M_400_499_45_54_yrs | M_400_499_55_64_yrs | M_400_499_65_74_yrs | M_400_499_75_84_yrs | M_400_499_85ov | M_400_499_Tot | M_500_649_15_19_yrs | M_500_649_20_24_yrs | M_500_649_25_34_yrs | M_500_649_35_44_yrs | M_500_649_45_54_yrs | M_500_649_55_64_yrs | M_500_649_65_74_yrs | M_500_649_75_84_yrs | M_500_649_85ov | M_500_649_Tot | M_650_799_15_19_yrs | M_650_799_20_24_yrs | M_650_799_25_34_yrs | M_650_799_35_44_yrs | M_650_799_45_54_yrs | M_650_799_55_64_yrs | M_650_799_65_74_yrs | M_650_799_75_84_yrs | M_650_799_85ov | M_650_799_Tot | M_800_999_15_19_yrs | M_800_999_20_24_yrs | M_800_999_25_34_yrs | M_800_999_35_44_yrs | M_800_999_45_54_yrs | M_800_999_55_64_yrs | M_800_999_65_74_yrs | M_800_999_75_84_yrs | M_800_999_85ov | M_800_999_Tot | M_1000_1249_15_19_yrs | M_1000_1249_20_24_yrs | M_1000_1249_25_34_yrs | M_1000_1249_35_44_yrs | M_1000_1249_45_54_yrs | M_1000_1249_55_64_yrs | M_1000_1249_65_74_yrs | M_1000_1249_75_84_yrs | M_1000_1249_85ov | M_1000_1249_Tot | M_1250_1499_15_19_yrs | 
M_1250_1499_20_24_yrs | M_1250_1499_25_34_yrs | M_1250_1499_35_44_yrs | M_1250_1499_45_54_yrs | M_1250_1499_55_64_yrs | M_1250_1499_65_74_yrs | M_1250_1499_75_84_yrs | M_1250_1499_85ov | M_1250_1499_Tot | M_1500_1749_15_19_yrs | M_1500_1749_20_24_yrs | M_1500_1749_25_34_yrs | M_1500_1749_35_44_yrs | M_1500_1749_45_54_yrs | M_1500_1749_55_64_yrs | M_1500_1749_65_74_yrs | M_1500_1749_75_84_yrs | M_1500_1749_85ov | M_1500_1749_Tot | M_1750_1999_15_19_yrs | M_1750_1999_20_24_yrs | M_1750_1999_25_34_yrs | M_1750_1999_35_44_yrs | M_1750_1999_45_54_yrs | M_1750_1999_55_64_yrs | M_1750_1999_65_74_yrs | M_1750_1999_75_84_yrs | M_1750_1999_85ov | M_1750_1999_Tot | M_2000_2999_15_19_yrs | M_2000_2999_20_24_yrs | M_2000_2999_25_34_yrs | M_2000_2999_35_44_yrs | M_2000_2999_45_54_yrs | M_2000_2999_55_64_yrs | M_2000_2999_65_74_yrs | M_2000_2999_75_84_yrs | M_2000_2999_85ov | M_2000_2999_Tot | M_3000_more_15_19_yrs | M_3000_more_20_24_yrs | M_3000_more_25_34_yrs | M_3000_more_35_44_yrs | M_3000_more_45_54_yrs | M_3000_more_55_64_yrs | M_3000_more_65_74_yrs | M_3000_more_75_84_yrs | M_3000_more_85ov | M_3000_more_Tot | M_PI_NS_15_19_yrs | M_PI_NS_ns_20_24_yrs | M_PI_NS_ns_25_34_yrs | M_PI_NS_ns_35_44_yrs | M_PI_NS_ns_45_54_yrs | M_PI_NS_ns_55_64_yrs | M_PI_NS_ns_65_74_yrs | M_PI_NS_ns_75_84_yrs | M_PI_NS_ns_85_yrs_ovr | M_PI_NS_ns_Tot | M_Tot_15_19_yrs | M_Tot_20_24_yrs | M_Tot_25_34_yrs | M_Tot_35_44_yrs | M_Tot_45_54_yrs | M_Tot_55_64_yrs | M_Tot_65_74_yrs | M_Tot_75_84_yrs | M_Tot_85ov | M_Tot_Tot | F_Neg_Nil_income_15_19_yrs | F_Neg_Nil_income_20_24_yrs | F_Neg_Nil_income_25_34_yrs | F_Neg_Nil_income_35_44_yrs | F_Neg_Nil_income_45_54_yrs | F_Neg_Nil_income_55_64_yrs | F_Neg_Nil_income_65_74_yrs | F_Neg_Nil_income_75_84_yrs | F_Neg_Nil_incme_85_yrs_ovr | F_Neg_Nil_income_Tot | F_1_149_15_19_yrs | F_1_149_20_24_yrs | F_1_149_25_34_yrs | F_1_149_35_44_yrs | F_1_149_45_54_yrs | F_1_149_55_64_yrs | F_1_149_65_74_yrs | F_1_149_75_84_yrs | F_1_149_85ov | F_1_149_Tot | 
F_150_299_15_19_yrs | F_150_299_20_24_yrs | F_150_299_25_34_yrs | F_150_299_35_44_yrs | F_150_299_45_54_yrs | F_150_299_55_64_yrs | F_150_299_65_74_yrs | F_150_299_75_84_yrs | F_150_299_85ov | F_150_299_Tot | F_300_399_15_19_yrs | F_300_399_20_24_yrs | F_300_399_25_34_yrs | F_300_399_35_44_yrs | F_300_399_45_54_yrs | F_300_399_55_64_yrs | F_300_399_65_74_yrs | F_300_399_75_84_yrs | F_300_399_85ov | F_300_399_Tot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 88338 | 31685 | 21321 | 12176 | 12700 | 16883 | 11502 | 4864 | 1736 | 201199 | 38027 | 15443 | 5314 | 3872 | 4598 | 6578 | 6248 | 2831 | 947 | 83859 | 14404 | 24502 | 18377 | 13035 | 14432 | 19362 | 21286 | 12944 | 4000 | 142347 | 6041 | 16083 | 15153 | 11440 | 14479 | 20680 | 43541 | 32914 | 10052 | 170390 | 6633 | 16767 | 17420 | 12871 | 15611 | 19490 | 31744 | 20256 | 8549 | 149345 | 5249 | 20317 | 23775 | 15826 | 16990 | 19775 | 24533 | 12661 | 4180 | 143307 | 2890 | 21927 | 38051 | 25091 | 24766 | 24018 | 18372 | 8652 | 2893 | 166657 | 1600 | 20837 | 56308 | 38378 | 37087 | 31737 | 16527 | 5393 | 1990 | 209859 | 672 | 14079 | 63881 | 47236 | 43346 | 35021 | 15012 | 4330 | 1529 | 225102 | 214 | 5767 | 45712 | 38351 | 33054 | 24126 | 8626 | 2113 | 683 | 158640 | 138 | 2598 | 34901 | 36477 | 31611 | 21816 | 6398 | 1549 | 538 | 136031 | 63 | 1085 | 21647 | 29005 | 24774 | 16129 | 4212 | 960 | 380 | 98252 | 116 | 951 | 28713 | 49459 | 41738 | 25319 | 6466 | 1558 | 535 | 154852 | 201 | 671 | 9675 | 31944 | 34203 | 20247 | 6749 | 1903 | 724 | 106312 | 17255 | 17031 | 36907 | 30837 | 29984 | 26386 | 23609 | 16507 | 8828 | 207345 | 181849 | 209733 | 437167 | 395979 | 379374 | 327567 | 244826 | 129451 | 47567 | 2353499 | 77647 | 31317 | 47176 | 39001 | 32724 | 39129 | 16906 | 6789 | 3159 | 293852 | 46359 | 17240 | 14080 | 15564 | 13261 | 14321 | 8753 | 3584 | 1493 | 134653 | 18099 | 28026 | 28760 | 27562 | 25839 | 32391 | 26881 | 15068 | 4998 | 207614 | 5983 | 18708 | 24559 | 25164 | 26693 | 33671 | 54764 | 34779 | 12164 | 236491 |
2016Census_G17B_VIC_STE.csv
STE_CODE_2016 | F_400_499_15_19_yrs | F_400_499_20_24_yrs | F_400_499_25_34_yrs | F_400_499_35_44_yrs | F_400_499_45_54_yrs | F_400_499_55_64_yrs | F_400_499_65_74_yrs | F_400_499_75_84_yrs | F_400_499_85ov | F_400_499_Tot | F_500_649_15_19_yrs | F_500_649_20_24_yrs | F_500_649_25_34_yrs | F_500_649_35_44_yrs | F_500_649_45_54_yrs | F_500_649_55_64_yrs | F_500_649_65_74_yrs | F_500_649_75_84_yrs | F_500_649_85ov | F_500_649_Tot | F_650_799_15_19_yrs | F_650_799_20_24_yrs | F_650_799_25_34_yrs | F_650_799_35_44_yrs | F_650_799_45_54_yrs | F_650_799_55_64_yrs | F_650_799_65_74_yrs | F_650_799_75_84_yrs | F_650_799_85ov | F_650_799_Tot | F_800_999_15_19_yrs | F_800_999_20_24_yrs | F_800_999_25_34_yrs | F_800_999_35_44_yrs | F_800_999_45_54_yrs | F_800_999_55_64_yrs | F_800_999_65_74_yrs | F_800_999_75_84_yrs | F_800_999_85ov | F_800_999_Tot | F_1000_1249_15_19_yrs | F_1000_1249_20_24_yrs | F_1000_1249_25_34_yrs | F_1000_1249_35_44_yrs | F_1000_1249_45_54_yrs | F_1000_1249_55_64_yrs | F_1000_1249_65_74_yrs | F_1000_1249_75_84_yrs | F_1000_1249_85ov | F_1000_1249_Tot | F_1250_1499_15_19_yrs | F_1250_1499_20_24_yrs | F_1250_1499_25_34_yrs | F_1250_1499_35_44_yrs | F_1250_1499_45_54_yrs | F_1250_1499_55_64_yrs | F_1250_1499_65_74_yrs | F_1250_1499_75_84_yrs | F_1250_1499_85ov | F_1250_1499_Tot | F_1500_1749_15_19_yrs | F_1500_1749_20_24_yrs | F_1500_1749_25_34_yrs | F_1500_1749_35_44_yrs | F_1500_1749_45_54_yrs | F_1500_1749_55_64_yrs | F_1500_1749_65_74_yrs | F_1500_1749_75_84_yrs | F_1500_1749_85ov | F_1500_1749_Tot | F_1750_1999_15_19_yrs | F_1750_1999_20_24_yrs | F_1750_1999_25_34_yrs | F_1750_1999_35_44_yrs | F_1750_1999_45_54_yrs | F_1750_1999_55_64_yrs | F_1750_1999_65_74_yrs | F_1750_1999_75_84_yrs | F_1750_1999_85ov | F_1750_1999_Tot | F_2000_2999_15_19_yrs | F_2000_2999_20_24_yrs | F_2000_2999_25_34_yrs | F_2000_2999_35_44_yrs | F_2000_2999_45_54_yrs | F_2000_2999_55_64_yrs | F_2000_2999_65_74_yrs | F_2000_2999_75_84_yrs | F_2000_2999_85ov | F_2000_2999_Tot | 
F_3000_more_15_19_yrs | F_3000_more_20_24_yrs | F_3000_more_25_34_yrs | F_3000_more_35_44_yrs | F_3000_more_45_54_yrs | F_3000_more_55_64_yrs | F_3000_more_65_74_yrs | F_3000_more_75_84_yrs | F_3000_more_85ov | F_3000_more_Tot | F_PI_NS_15_19_yrs | F_PI_NS_ns_20_24_yrs | F_PI_NS_ns_25_34_yrs | F_PI_NS_ns_35_44_yrs | F_PI_NS_ns_45_54_yrs | F_PI_NS_ns_55_64_yrs | F_PI_NS_ns_65_74_yrs | F_PI_NS_ns_75_84_yrs | F_PI_NS_ns_85_yrs_ovr | F_PI_NS_ns_Tot | F_Tot_15_19_yrs | F_Tot_20_24_yrs | F_Tot_25_34_yrs | F_Tot_35_44_yrs | F_Tot_45_54_yrs | F_Tot_55_64_yrs | F_Tot_65_74_yrs | F_Tot_75_84_yrs | F_Tot_85ov | F_Tot_Tot | P_Neg_Nil_income_15_19_yrs | P_Neg_Nil_income_20_24_yrs | P_Neg_Nil_income_25_34_yrs | P_Neg_Nil_income_35_44_yrs | P_Neg_Nil_income_45_54_yrs | P_Neg_Nil_income_55_64_yrs | P_Neg_Nil_income_65_74_yrs | P_Neg_Nil_income_75_84_yrs | P_Negtve_Nil_incme_85_yrs_ovr | P_Neg_Nil_income_Tot | P_1_149_15_19_yrs | P_1_149_20_24_yrs | P_1_149_25_34_yrs | P_1_149_35_44_yrs | P_1_149_45_54_yrs | P_1_149_55_64_yrs | P_1_149_65_74_yrs | P_1_149_75_84_yrs | P_1_149_85ov | P_1_149_Tot | P_150_299_15_19_yrs | P_150_299_20_24_yrs | P_150_299_25_34_yrs | P_150_299_35_44_yrs | P_150_299_45_54_yrs | P_150_299_55_64_yrs | P_150_299_65_74_yrs | P_150_299_75_84_yrs | P_150_299_85ov | P_150_299_Tot | P_300_399_15_19_yrs | P_300_399_20_24_yrs | P_300_399_25_34_yrs | P_300_399_35_44_yrs | P_300_399_45_54_yrs | P_300_399_55_64_yrs | P_300_399_65_74_yrs | P_300_399_75_84_yrs | P_300_399_85ov | P_300_399_Tot | P_400_499_15_19_yrs | P_400_499_20_24_yrs | P_400_499_25_34_yrs | P_400_499_35_44_yrs | P_400_499_45_54_yrs | P_400_499_55_64_yrs | P_400_499_65_74_yrs | P_400_499_75_84_yrs | P_400_499_85ov | P_400_499_Tot | P_500_649_15_19_yrs | P_500_649_20_24_yrs | P_500_649_25_34_yrs | P_500_649_35_44_yrs | P_500_649_45_54_yrs | P_500_649_55_64_yrs | P_500_649_65_74_yrs | P_500_649_75_84_yrs | P_500_649_85ov | P_500_649_Tot | P_650_799_15_19_yrs | P_650_799_20_24_yrs | P_650_799_25_34_yrs | 
P_650_799_35_44_yrs | P_650_799_45_54_yrs | P_650_799_55_64_yrs | P_650_799_65_74_yrs | P_650_799_75_84_yrs | P_650_799_85ov | P_650_799_Tot | P_800_999_15_19_yrs | P_800_999_20_24_yrs | P_800_999_25_34_yrs | P_800_999_35_44_yrs | P_800_999_45_54_yrs | P_800_999_55_64_yrs | P_800_999_65_74_yrs | P_800_999_75_84_yrs | P_800_999_85ov | P_800_999_Tot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 4020 | 17474 | 26607 | 26466 | 29789 | 31568 | 47499 | 37154 | 21386 | 241956 | 3205 | 20235 | 37882 | 36319 | 37225 | 31226 | 28445 | 14558 | 8426 | 217522 | 1810 | 20111 | 42785 | 37279 | 38889 | 29372 | 17729 | 7881 | 3589 | 199446 | 867 | 17972 | 49988 | 37838 | 39970 | 28577 | 12049 | 4675 | 2420 | 194360 | 392 | 11564 | 54077 | 36905 | 37977 | 26801 | 9151 | 3324 | 1761 | 181940 | 96 | 3658 | 38495 | 27640 | 25747 | 16926 | 4799 | 1356 | 740 | 119460 | 71 | 1125 | 24977 | 23872 | 21781 | 13900 | 3405 | 1079 | 577 | 90793 | 40 | 395 | 11895 | 15600 | 14463 | 9176 | 2189 | 731 | 364 | 54847 | 63 | 328 | 13073 | 21020 | 17329 | 9786 | 2806 | 1014 | 525 | 65948 | 183 | 375 | 3511 | 11690 | 11238 | 5823 | 2728 | 1318 | 691 | 37568 | 15667 | 15527 | 34151 | 28009 | 28119 | 27218 | 26679 | 22253 | 18139 | 215766 | 174492 | 204065 | 452031 | 409936 | 401040 | 349886 | 264772 | 155554 | 80427 | 2492203 | 165978 | 63007 | 68500 | 51179 | 45422 | 56016 | 28406 | 11651 | 4892 | 495052 | 84384 | 32685 | 19394 | 19436 | 17858 | 20896 | 14999 | 6416 | 2440 | 218506 | 32505 | 52526 | 47141 | 40596 | 40273 | 51755 | 48162 | 28014 | 8996 | 349958 | 12023 | 34790 | 39718 | 36611 | 41179 | 54351 | 98309 | 67688 | 22219 | 406878 | 10653 | 34240 | 44035 | 39330 | 45405 | 51060 | 79246 | 57407 | 29935 | 391308 | 8450 | 40552 | 61664 | 52138 | 54212 | 51004 | 52981 | 27216 | 12610 | 360838 | 4702 | 42035 | 80835 | 62373 | 63652 | 53390 | 36103 | 16532 | 6478 | 366105 | 2466 | 38808 | 106298 | 76216 | 77055 | 60313 | 28575 | 10077 | 4407 | 404215 |
2016Census_G17C_VIC_STE.csv
STE_CODE_2016 | P_1000_1249_15_19_yrs | P_1000_1249_20_24_yrs | P_1000_1249_25_34_yrs | P_1000_1249_35_44_yrs | P_1000_1249_45_54_yrs | P_1000_1249_55_64_yrs | P_1000_1249_65_74_yrs | P_1000_1249_75_84_yrs | P_1000_1249_85ov | P_1000_1249_Tot | P_1250_1499_15_19_yrs | P_1250_1499_20_24_yrs | P_1250_1499_25_34_yrs | P_1250_1499_35_44_yrs | P_1250_1499_45_54_yrs | P_1250_1499_55_64_yrs | P_1250_1499_65_74_yrs | P_1250_1499_75_84_yrs | P_1250_1499_85ov | P_1250_1499_Tot | P_1500_1749_15_19_yrs | P_1500_1749_20_24_yrs | P_1500_1749_25_34_yrs | P_1500_1749_35_44_yrs | P_1500_1749_45_54_yrs | P_1500_1749_55_64_yrs | P_1500_1749_65_74_yrs | P_1500_1749_75_84_yrs | P_1500_1749_85ov | P_1500_1749_Tot | P_1750_1999_15_19_yrs | P_1750_1999_20_24_yrs | P_1750_1999_25_34_yrs | P_1750_1999_35_44_yrs | P_1750_1999_45_54_yrs | P_1750_1999_55_64_yrs | P_1750_1999_65_74_yrs | P_1750_1999_75_84_yrs | P_1750_1999_85ov | P_1750_1999_Tot | P_2000_2999_15_19_yrs | P_2000_2999_20_24_yrs | P_2000_2999_25_34_yrs | P_2000_2999_35_44_yrs | P_2000_2999_45_54_yrs | P_2000_2999_55_64_yrs | P_2000_2999_65_74_yrs | P_2000_2999_75_84_yrs | P_2000_2999_85ov | P_2000_2999_Tot | P_3000_more_15_19_yrs | P_3000_more_20_24_yrs | P_3000_more_25_34_yrs | P_3000_more_35_44_yrs | P_3000_more_45_54_yrs | P_3000_more_55_64_yrs | P_3000_more_65_74_yrs | P_3000_more_75_84_yrs | P_3000_more_85ov | P_3000_more_Tot | P_PI_NS_15_19_yrs | P_PI_NS_ns_20_24_yrs | P_PI_NS_ns_25_34_yrs | P_PI_NS_ns_35_44_yrs | P_PI_NS_ns_45_54_yrs | P_PI_NS_ns_55_64_yrs | P_PI_NS_ns_65_74_yrs | P_PI_NS_ns_75_84_yrs | P_PI_NS_ns_85_yrs_ovr | P_PI_NS_ns_Tot | P_Tot_15_19_yrs | P_Tot_20_24_yrs | P_Tot_25_34_yrs | P_Tot_35_44_yrs | P_Tot_45_54_yrs | P_Tot_55_64_yrs | P_Tot_65_74_yrs | P_Tot_75_84_yrs | P_Tot_85ov | P_Tot_Tot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 1061 | 25642 | 117956 | 84132 | 81324 | 61821 | 24164 | 7657 | 3287 | 407041 | 311 | 9424 | 84206 | 65993 | 58799 | 41049 | 13420 | 3469 | 1422 | 278098 | 210 | 3720 | 59880 | 60349 | 53396 | 35718 | 9803 | 2624 | 1115 | 226824 | 103 | 1480 | 33544 | 44600 | 39237 | 25306 | 6403 | 1687 | 741 | 153095 | 174 | 1279 | 41788 | 70485 | 59071 | 35105 | 9266 | 2575 | 1061 | 220801 | 382 | 1044 | 13185 | 43637 | 45438 | 26071 | 9480 | 3222 | 1417 | 143877 | 32924 | 32554 | 71062 | 58843 | 58102 | 53603 | 50289 | 38765 | 26966 | 423108 | 356340 | 413792 | 889190 | 805920 | 780420 | 677453 | 509599 | 285006 | 127993 | 4845710 |
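The three files share the `STE_CODE_2016` column, so they can be stitched back into one wide table. A minimal sketch, using only the grand-total column copied from each table above rather than reading the CSV files (the object names `g17a`, `g17b`, `g17c` and `g17_check` are made up here):

```r
library(tibble)
library(dplyr) # part of the tidyverse

# One row per file, keeping only the grand-total column from each.
# Values are copied from the three tables above (Victoria, STE code 2).
g17a <- tibble(STE_CODE_2016 = 2, M_Tot_Tot = 2353499)
g17b <- tibble(STE_CODE_2016 = 2, F_Tot_Tot = 2492203)
g17c <- tibble(STE_CODE_2016 = 2, P_Tot_Tot = 4845710)

# Join the pieces on the shared geography code
g17 <- g17a %>%
  left_join(g17b, by = "STE_CODE_2016") %>%
  left_join(g17c, by = "STE_CODE_2016")

# Compare the reported person total with the sum of male and female totals
g17_check <- g17 %>%
  mutate(diff = P_Tot_Tot - (M_Tot_Tot + F_Tot_Tot))

g17_check
```

In this extract the male and female totals sum to 4,845,702, which is 8 short of the reported person total. The DataPack documentation is the place to check why the totals need not add up exactly.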
There are a few things to note:
But what does the data show?
Wickham (2014) Tidy Data. Journal of Statistical Software 59
So what about the ABS 2016 Census Data?
`F_400_499_15_19_yrs` is the number of females aged 15-19 years who earn $400-499 per week (in Victoria).
age_min | age_max | gender | income_min | income_max | count |
---|---|---|---|---|---|
15 | 19 | female | 400 | 499 | 4020 |
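One way the reshaping could be sketched in R, assuming the column names follow the sex, income range, age range pattern seen in `F_400_499_15_19_yrs`. The `wide` tibble holds just two columns copied from `2016Census_G17B_VIC_STE.csv`. The regular expression, the `gender_labels` lookup and the reading of the `P` prefix as "person" are assumptions for illustration, and names that break the pattern (e.g. those ending in `85ov` or `Tot`) would need extra handling.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Two columns copied from 2016Census_G17B_VIC_STE.csv (Victoria, STE code 2)
wide <- tibble(
  STE_CODE_2016 = 2,
  F_400_499_15_19_yrs = 4020,
  F_400_499_20_24_yrs = 17474
)

# Hypothetical lookup from the one-letter prefix to a label
gender_labels <- c(M = "male", F = "female", P = "person")

tidy <- wide %>%
  # one row per original column, with the column name kept in `name`
  pivot_longer(-STE_CODE_2016, names_to = "name", values_to = "count") %>%
  # split the column name into its five components (all still character)
  extract(
    name,
    into = c("gender", "income_min", "income_max", "age_min", "age_max"),
    regex = "^([MFP])_(\\d+)_(\\d+)_(\\d+)_(\\d+)_yrs$"
  ) %>%
  mutate(
    gender = unname(gender_labels[gender]),
    across(c(income_min, income_max, age_min, age_max), as.numeric)
  ) %>%
  select(age_min, age_max, gender, income_min, income_max, count)

tidy
```

The numeric conversion is done explicitly rather than with `convert = TRUE`, because automatic type conversion would read a column of `"F"` values as logical `FALSE`.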
You can include other information, e.g. the geography code (useful if combining with other geographical areas) or the average age/income.
Note that some categories don't have upper bounds, e.g. `M_3000_more_85ov`. In R, `-Inf` and `Inf` are used to represent −∞ and ∞, respectively.
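As a rough sketch of how an open-ended income bracket might be encoded, the snippet below pulls the income bounds out of two column names from the table and uses `Inf` where the name says `more`. The regular expressions and variable names are assumptions for illustration, not part of the ABS documentation.

```r
library(stringr)

name <- c("M_2000_2999_85ov", "M_3000_more_85ov")

# Lower income bound: the first number after the one-letter prefix
income_min <- as.numeric(str_match(name, "^[MFP]_(\\d+)_")[, 2])

# Upper income bound: Inf when the name says "more", otherwise the second number
income_max <- ifelse(
  str_detect(name, "_more_"),
  Inf,
  as.numeric(str_match(name, "^[MFP]_\\d+_(\\d+)_")[, 2])
)

data.frame(name, income_min, income_max)

# Inf behaves sensibly in comparisons
is.finite(income_max)
```

Storing the bound as `Inf` rather than `NA` keeps filters such as `income_max >= 5000` working without special cases.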
You'll wrangle the data into tidy form in the tutorial.
The `stringr` package is powered by the `stringi` package, which in turn uses the ICU C library to provide fast performance for string manipulation.
library(tidyverse) # includes `stringr`
Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0.
Gagolewski M. and others (2020). R package stringi: Character string processing facilities.
`stringr` functions are prefixed with `str_` (`stringi` functions with `stri_`), and the first argument is a string (or a vector of strings).
What do `str_trim` and `str_squish` do?
str_trim(c(" Apple ", " Goji Berry "))
## [1] "Apple" "Goji Berry"
str_squish(c(" Apple ", " Goji Berry "))
## [1] "Apple" "Goji Berry"
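The two functions only differ when a string contains repeated whitespace in the middle. The sketch below uses an illustrative vector `x` with extra spaces at the ends and inside the second string: `str_trim` removes whitespace from the start and end only, while `str_squish` also collapses internal runs of whitespace to a single space.

```r
library(stringr)

# Illustrative input with extra spaces at the ends and in the middle
x <- c("  Apple  ", "  Goji    Berry  ")

str_trim(x)    # strips leading and trailing whitespace only
str_squish(x)  # also collapses internal runs of whitespace to one space
```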
stringr
Base R | stringr |
---|---|
gregexpr(pattern, x) | str_locate_all(x, pattern) |
grep(pattern, x, value = TRUE) | str_subset(x, pattern) |
grep(pattern, x) | str_which(x, pattern) |
grepl(pattern, x) | str_detect(x, pattern) |
gsub(pattern, replacement, x) | str_replace_all(x, pattern, replacement) |
nchar(x) | str_length(x) |
order(x) | str_order(x) |
regexec(pattern, x) + regmatches() | str_match(x, pattern) |
regexpr(pattern, x) + regmatches() | str_extract(x, pattern) |
regexpr(pattern, x) | str_locate(x, pattern) |
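A few rows of the table can be checked directly. This is a small sketch on an illustrative vector `x`; note that base R takes the pattern first, whereas `stringr` takes the string first.

```r
library(stringr)

x <- c("koala", "kangaroo", "kookaburra", "numbat")  # sample strings

# Each pair is expected to give the same result
identical(grepl("oo", x), str_detect(x, "oo"))
identical(grep("oo", x), str_which(x, "oo"))
identical(grep("oo", x, value = TRUE), str_subset(x, "oo"))
identical(gsub("o", "0", x), str_replace_all(x, "o", "0"))
identical(nchar(x), str_length(x))
```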
Why use `stringr`?
A number of design considerations ensure consistency in syntax and user expectation (both for input and output).
For example, let's consider combining multiple strings into one.
Base R
paste0("Area", "1", c("A", "B"))
## [1] "Area1A" "Area1B"
paste0("Area", "1", c("A", NA, "C"))
## [1] "Area1A" "Area1NA" "Area1C"
stringr
str_c("Area", "1", c("A", "B"))
## [1] "Area1A" "Area1B"
str_c("Area", "1", c("A", NA, "C"))
## [1] "Area1A" NA "Area1C"
If the Base R result is preferable, then `NA` can first be replaced with the character string "NA" using `str_replace_na(c("A", NA, "C"))`.
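A short sketch of that workaround, using an illustrative vector named `suffix`:

```r
library(stringr)

suffix <- c("A", NA, "C")

# str_c propagates the missing value
str_c("Area", "1", suffix)

# Converting NA to the string "NA" first reproduces the paste0() result
str_c("Area", "1", str_replace_na(suffix))
```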
LGA <- ozmaps::abs_lga %>% pull(NAME)
LGA[1:7]
## [1] "Broken Hill (C)" "Waroona (S)" "Toowoomba (R)" "West Arthur (S)"
## [5] "Moreton Bay (R)" "Etheridge (S)" "Cleve (DC)"
C = Cities | A = Areas | RC = Rural Cities |
B = Boroughs | S = Shires | DC = District Councils |
M = Municipalities | T = Towns | AC = Aboriginal Councils |
RegC = Regional Councils |
🎯 Extract the LGA status from the LGA names
Michael Sumner (2020). ozmaps: Australia Maps. R package version 0.3.6.
How?
str_extract(LGA, "\\(.+\\)")
## [1] "(C)" "(S)" "(R)" "(S)" "(R)" ## [6] "(S)" "(DC)" "(R)" "(DC)" "(C)" ## [11] "(DC)" "(S)" "(S)" "(S)" "(DC)" ## [16] "(A)" "(C)" "(A)" "(T)" "(RC)" ## [21] "(A)" "(S)" "(S)" "(S)" "(C)" ## [26] "(DC)" "(R)" "(A)" "(C)" "(DC)" ## [31] "(S)" "(S)" "(A)" "(S)" "(S)" ## [36] "(R)" "(M)" "(A)" "(C)" "(S)" ## [41] "(S)" "(C)" "(A)" "(S)" "(C)" ## [46] "(AC)" "(A)" "(S)" "(A)" "(C)" ## [51] "(A)" "(R)" "(S)" "(T)" "(C)" ## [56] "(S)" "(S)" "(R)" "(C)" "(T)" ## [61] "(C)" "(S)" "(C)" "(C)" "(C)" ## [66] "(C)" "(S)" "(DC)" "(DC)" "(S)" ## [71] "(R)" "(R)" "(S)" "(B)" "(DC)" ## [76] "(M)" "(A)" "(C)" "(S)" "(S)" ## [81] "(S)" "(S)" "(S)" "(S)" "(S)" ## [86] "(C)" "(A)" "(C)" "(A)" "(S)" ## [91] "(C)" "(A)" "(S)" "(S)" "(S)" ## [96] "(S)" "(DC)" "(S)" "(S)" "(S)" ## [101] "(C)" "(C)" "(DC)" "(S)" "(S)" ## [106] "(C)" "(S)" "(DC)" "(C)" "(C)" ## [111] "(S)" "(S)" "(S)" "(S)" "(S)" ## [116] "(S)" "(A)" "(DC)" "(S)" "(A)" ## [121] "(C)" "(A)" "(S)" "(A)" "(DC)" ## [126] "(S)" "(C)" "(S)" "(A)" "(S)" ## [131] "(M)" "(S)" "(DC)" "(R)" "(C)" ## [136] "(C)" "(S)" "(C)" "(S)" "(T)" ## [141] "(S)" "(S)" "(DC)" "(S)" "(T)" ## [146] "(C)" "(S)" "(M)" "(S)" "(DC)" ## [151] "(C)" "(S)" "(M)" "(C)" "(S)" ## [156] "(C)" "(C)" "(R)" "(S)" "(C)" ## [161] "(C)" "(R)" "(S)" "(C)" "(A)" ## [166] "(T)" "(S)" "(RC)" "(C)" "(A)" ## [171] "(A)" "(A)" "(S)" "(A)" "(S)" ## [176] "(S)" "(T)" "(S)" "(S)" "(S)" ## [181] "(A)" "(DC)" "(M)" "(C)" "(S)" ## [186] "(A)" "(T)" "(A)" "(C)" "(S)" ## [191] "(C)" "(R)" "(C)" "(S)" "(S)" ## [196] "(S)" "(S)" "(R)" "(C)" "(DC)" ## [201] "(A)" "(DC)" "(R)" "(C)" "(S)" ## [206] "(S)" "(C)" "(C)" "(R)" "(S)" ## [211] "(S)" "(C)" "(A)" "(S)" "(S)" ## [216] "(C)" "(DC)" "(S)" "(M) (Tas.)" "(M) (Tas.)"## [221] "(C) (Vic.)" "(C) (Vic.)" "(S)" "(DC)" "(S)" ## [226] "(RC)" "(S)" "(DC)" "(S)" "(S)" ## [231] "(R)" "(S)" "(A)" "(C)" "(C)" ## [236] "(A)" "(A)" "(RC)" "(S)" "(C)" ## [241] "(S)" "(S)" "(S)" "(C)" "(C)" ## [246] "(S)" "(C)" "(C)" "(C)" "(A)" ## [251] "(C)" 
"(S)" "(S)" "(S)" "(S)" ## [256] "(S)" "(A)" "(A)" "(A)" "(S)" ## [261] "(A)" "(A)" "(S)" "(S)" "(C)" ## [266] "(A)" "(M)" "(S)" "(S)" "(C)" ## [271] "(R)" "(S)" "(R)" "(DC)" "(R)" ## [276] "(C)" "(S)" "(S)" "(C)" "(S)" ## [281] "(A)" "(R)" "(DC)" "(A)" "(C)" ## [286] "(A)" "(S)" "(S)" "(A)" "(C)" ## [291] "(C)" "(A)" "(T)" "(S)" "(C)" ## [296] "(A)" "(A)" "(S)" "(S)" "(T)" ## [301] "(C)" "(A)" "(A)" "(DC)" "(A)" ## [306] "(C)" "(M)" "(M)" "(S)" "(A)" ## [311] "(A)" "(C)" "(C)" "(S)" "(DC)" ## [316] "(S)" "(C)" "(S)" "(S)" "(DC)" ## [321] "(RegC)" "(C)" "(S)" "(S)" NA ## [326] "(A)" "(S)" "(A)" "(S)" "(A)" ## [331] "(S)" "(C)" "(R)" "(C)" "(S)" ## [336] "(A)" "(DC)" "(S)" "(A)" "(R)" ## [341] "(S)" "(S)" "(RC)" "(T)" "(A)" ## [346] "(M)" "(A)" "(S)" "(S)" "(S)" ## [351] "(S)" "(A)" "(RC)" "(S)" "(A)" ## [356] "(R)" "(S)" "(S)" "(C)" "(S)" ## [361] "(DC)" "(M)" "(M)" "(AC)" "(DC)" ## [366] "(A)" "(A)" "(S)" "(S)" "(A)" ## [371] "(C)" "(S)" "(S)" "(C)" "(R)" ## [376] "(S)" "(S)" NA "(A)" "(T)" ## [381] "(S)" "(A)" "(C)" "(C)" "(A)" ## [386] "(C)" "(DC)" "(C)" "(A)" "(A)" ## [391] "(A)" "(S)" "(DC)" "(DC)" "(S)" ## [396] "(M)" "(R)" "(DC)" "(C)" "(S)" ## [401] "(S)" "(C)" "(C)" "(C)" "(C)" ## [406] "(C)" "(S)" "(A)" NA "(S)" ## [411] "(C)" "(S)" "(M)" "(C)" "(S)" ## [416] "(S)" NA "(C)" "(S)" "(C)" ## [421] "(DC)" "(S)" "(C)" "(S)" "(C)" ## [426] "(M)" "(A)" "(A)" "(A)" "(S)" ## [431] "(C)" "(S)" "(S)" "(S)" "(A)" ## [436] "(A)" "(A)" "(S)" "(S)" "(S)" ## [441] "(C)" "(S)" "(C)" "(C)" "(C)" ## [446] "(C) (NSW)" "(S) (Qld)" "(R) (Qld)" "(DC) (SA)" "(C) (SA)" ## [451] "(M) (Tas.)" "(M) (Tas.)" "(C)" "(R)" "(M)" ## [456] "(C)" "(R)" "(S)" "(RC)" "(S)" ## [461] "(M)" "(C)" "(R)" "(C)" "(DC)" ## [466] "(C)" "(C)" "(M)" "(C)" "(S)" ## [471] "(C)" "(DC)" "(M)" "(S)" "(C)" ## [476] "(C)" "(A)" "(DC)" "(R)" "(C)" ## [481] "(C)" "(A)" "(M)" "(C)" "(C)" ## [486] "(S)" "(S)" "(S)" "(A)" "(R)" ## [491] "(M)" "(A)" "(R)" "(A)" "(A)" ## [496] "(R)" "(R)" "(R)" "(S)" "(C)" ## [501] 
"(C)" "(S)" "(A)" "(S)" "(M)" ## [506] "(M)" "(S)" "(A)" "(A)" "(S)" ## [511] "(A)" "(C)" "(DC)" "(S)" "(S)" ## [516] NA "(A)" NA "(R)" "(C)" ## [521] "(S)" "(C)" "(S)" "(A)" "(A)" ## [526] "(A)" "(A)" "(C)" "(A)" "(A)" ## [531] "(A)" "(A)" "(C) (NSW)" "(A)" "(C)" ## [536] "(R)" "(S)" "(A)" "(R)" "(C)" ## [541] "(A)" "(S)" "(A)" "(A)"
What is `"\\(.+\\)"`?
It is a pattern written as a regular expression (regex).
In an R string, a backslash must itself be escaped with another `\` when `\` is included in the pattern (yes, this means that you can have a lot of backslashes... just keep adding `\` until it works! Enjoy this xkcd comic.)
From R 4.0.0 onwards, a raw string avoids the extra `\`, e.g. `r"(\(.+\))"` is the same as `"\\(.+\\)"`.
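The sketch below checks the raw-string equivalence and applies the pattern to a few illustrative names written in the same style as the LGA vector (the vector `lga` is made up for this example, not pulled from `ozmaps`). The second pattern, `"\\([^)]+\\)"`, is an assumed alternative and not taken from the slides: it stops at the first closing parenthesis, which avoids results such as "(M) (Tas.)".

```r
library(stringr)

# Raw string (R >= 4.0.0) and escaped string describe the same pattern
identical(r"(\(.+\))", "\\(.+\\)")

# Illustrative names in the same style as the LGA vector
lga <- c("Broken Hill (C)", "Latrobe (M) (Tas.)", "Unincorporated ACT")

# `.+` is greedy, so the match runs to the last ")" in the string
str_extract(lga, "\\(.+\\)")

# Assumed alternative: match anything except ")" to stop at the first ")"
str_extract(lga, "\\([^)]+\\)")
```

Names without any parentheses return `NA` under both patterns.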
ozanimals <- c("koala", "kangaroo", "kookaburra", "numbat")
= Basic match
str_detect(ozanimals, "oo")
## [1] FALSE TRUE TRUE FALSE
str_extract(ozanimals, "oo")
## [1] NA "oo" "oo" NA
str_match(ozanimals, "oo")
##      [,1]
## [1,] NA
## [2,] "oo"
## [3,] "oo"
## [4,] NA
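`str_match` returns a matrix because it can also report parenthesised groups. As a small sketch (the pattern `"(.)oo"` is chosen here only for illustration), the first column holds the full match and the second column holds the character captured just before "oo".

```r
library(stringr)

ozanimals <- c("koala", "kangaroo", "kookaburra", "numbat")

# Column 1 is the full match; column 2 is the group in parentheses
m <- str_match(ozanimals, "(.)oo")
m
```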
= Meta-characters
`"."` a wildcard to match any character except a new line
str_starts(c("color", "colouur", "colour", "red-column"), "col...")
## [1] FALSE TRUE TRUE FALSE
`"(.|.)"` a marked subexpression with alternate possibilities marked with `|`
str_replace(c("lovelove", "move", "stove", "drove"), "(l|dr|st)o", "ha")
## [1] "havelove" "move" "have" "have"
`"[...]"` matches a single character contained in the bracket
str_replace_all(c("cake", "cookie", "lamington"), "[aeiou]", "_")
## [1] "c_k_" "c__k__" "l_m_ngt_n"
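Because `.` is a wildcard, matching a literal full stop requires escaping it. The sketch below uses hypothetical file names to contrast the two, and combines the escaped dot with an alternation group.

```r
library(stringr)

files <- c("data.csv", "data_csv", "notes.txt")  # hypothetical file names

# "." matches any character, so "data_csv" also matches
str_detect(files, "data.csv")

# "\\." matches a literal full stop only
str_detect(files, "data\\.csv")

# A literal full stop followed by either extension
str_detect(files, "\\.(csv|txt)")
```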
= Meta-character quantifiers
`"?"` zero or one occurrence of the preceding element
str_extract(c("color", "colouur", "colour", "red"), "colou?r")
## [1] "color" NA "colour" NA
`"*"` zero or more occurrences of the preceding element
str_extract(c("color", "colouur", "colour", "red"), "colou*r")
## [1] "color" "colouur" "colour" NA
`"+"` one or more occurrences of the preceding element
str_extract(c("color", "colouur", "colour", "red"), "colou+r")
## [1] NA "colouur" "colour" NA
`"{n}"` the preceding element is matched exactly `n` times
str_replace(c("banana", "bananana", "bana", "banananana"), "ba(na){2}", "-")
## [1] "-" "-na" "bana" "-nana"
`"{min,}"` the preceding element is matched `min` times or more
str_replace(c("banana", "bananana", "bana", "banananana"), "ba(na){2,}", "-")
## [1] "-" "-" "bana" "-"
`"{min,max}"` the preceding element is matched at least `min` times but no more than `max` times
str_replace(c("banana", "bananana", "bana", "banananana"), "ba(na){1,2}", "-")
## [1] "-" "-na" "-" "-nana"
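The same quantifiers are handy for pulling numbers out of census column names. This is a sketch on one column name from the income table; `[0-9]` stands for any single digit, and the variable name `col` is only for illustration.

```r
library(stringr)

col <- "P_1000_1249_15_19_yrs"

# First run of exactly four digits
str_extract(col, "[0-9]{4}")

# All runs of two to four digits (the quantifier is greedy)
str_extract_all(col, "[0-9]{2,4}")[[1]]

# {2} does not require the run to end there: it takes the first two digits
str_extract(col, "[0-9]{2}")
```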
= Character classes
`[:alpha:]` or `[A-Za-z]` to match alphabetic characters
`[:alnum:]` or `[A-Za-z0-9]` to match alphanumeric characters
`[:digit:]` or `[0-9]` or `\\d` to match a digit
`[^0-9]` to match non-digits
`[a-c]` to match a, b or c
`[A-Z]` to match uppercase letters
`[a-z]` to match lowercase letters
`[:space:]` or `[ \t\r\n\v\f]` to match whitespace characters
str_view(c("banana", "bananana", "bana", "banabanana"), "ba(na){1,2}")
str_view_all(c("banana", "bananana", "bana", "banabanana"), "ba(na){1,2}")
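The character classes listed above can be tried on a small illustrative vector. In this sketch the POSIX-style names are wrapped in an extra pair of brackets, e.g. `[[:digit:]]`, which is the form that base R regex functions also accept; the vector `x` is made up for the example.

```r
library(stringr)

x <- c("Area 51", "route66", "no digits")  # illustrative strings

str_extract(x, "[[:digit:]]+")    # first run of digits
str_extract(x, "\\d+")            # same, using the \\d shorthand
str_replace_all(x, "[^0-9]", "")  # drop every non-digit
str_detect(x, "[[:space:]]")      # does the string contain whitespace?
```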
If a `stringr` function ends with `_all`, all matches of the pattern are considered.
The version without `_all` only considers the first match.
str_extract(LGA, "\\(.+\\)")
str_extract(LGA, "\\(.+\\)") %>% table()
## .
##        (A)       (AC)        (B)        (C)  (C) (NSW)   (C) (SA) (C) (Vic.)
##        100          2          1        120          2          1          2
##       (DC)  (DC) (SA)        (M) (M) (Tas.)        (R)  (R) (Qld)       (RC)
##         40          1         23          4         38          1          7
##     (RegC)        (S)  (S) (Qld)        (T)
##          1        182          1         12
Where the same Local Government Area name appears in different States or Territories, the State or Territory abbreviation appears in parenthesis after the name. Local Government Area names are therefore unique.
-Australian Bureau of Statistics
str_extract(LGA, "\\([^)]+\\)") %>% table()
## .
##    (A)   (AC)    (B)    (C)   (DC)    (M)    (R)   (RC) (RegC)    (S)    (T)
##    100      2      1    125     41     27     39      7      1    183     12
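The difference between the two patterns is easiest to see on a handful of names. The sketch below uses a small hand-typed vector, `lga_sample`, which is an assumption for illustration and is not read from the census file.

```r
library(stringr)

# Illustrative names only (hand-typed, not taken from the data file)
lga_sample <- c("Campbelltown (C) (NSW)", "Kingston (C) (Vic.)", "Ballarat (C)")

# Greedy: .+ keeps going until the LAST closing bracket in the string
greedy <- str_extract(lga_sample, "\\(.+\\)")
greedy
#> [1] "(C) (NSW)"  "(C) (Vic.)" "(C)"

# Negated class: [^)]+ cannot cross a closing bracket, so the match
# stops at the FIRST closing bracket
first_only <- str_extract(lga_sample, "\\([^)]+\\)")
first_only
#> [1] "(C)" "(C)" "(C)"
```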
str_extract(LGA, "\\([^)]+\\)") %>%
  # remove the brackets
  str_replace_all("[\\(\\)]", "") %>%
  table()
## .
##    A   AC    B    C   DC    M    R   RC RegC    S    T
##  100    2    1  125   41   27   39    7    1  183   12
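An alternative that the slides do not use is to capture the text between the brackets in a single step with `str_match()`. This is only a sketch on a hand-typed vector (`lga_sample` is an assumption, including the entry without brackets, which stands in for the names that return `NA`).

```r
library(stringr)

# Illustrative names only (hand-typed, not taken from the data file)
lga_sample <- c("Campbelltown (C) (NSW)", "Ballarat (C)", "No brackets here")

# str_match() returns a matrix: column 1 is the full match,
# column 2 is the first capture group, i.e. the text inside the brackets
lga_type <- str_match(lga_sample, "\\(([^)]+)\\)")[, 2]
lga_type
#> [1] "C" "C" NA

# Count the types, keeping track of names with no bracketed code
table(lga_type, useNA = "ifany")
```

Compared with extracting and then removing the brackets, this keeps the pattern in one place, at the cost of indexing into a matrix.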
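"[]" for a single character match
We want to match ( and ), but these are meta-characters
so they are escaped in the regular expression as \( and \)
and, because the backslash must itself be escaped inside an R string, they are written as \\( and \\)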
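The escaping levels above can be checked directly in R. As a minimal sketch, `writeLines()` prints the string as the regular expression engine receives it, which shows that each `\\` in the R source is a single backslash in the pattern.

```r
library(stringr)

pattern <- "[\\(\\)]"

# What the regex engine actually sees: one backslash before each bracket
writeLines(pattern)
#> [\(\)]

# Six characters once R has resolved the escapes: [ \ ( \ ) ]
n_chars <- nchar(pattern)
n_chars
#> [1] 6

# The character class matches either bracket, so both are removed
cleaned <- str_replace_all("(DC)", pattern, "")
cleaned
#> [1] "DC"
```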
str_extract(LGA, r"(\([^)]+\))") %>%
  # remove the brackets
  str_replace_all(r"([\(\)])", "") %>%
  table()
## .
##    A   AC    B    C   DC    M    R   RC RegC    S    T
##  100    2    1  125   41   27   39    7    1  183   12
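Raw strings, written as `r"(...)"`, are available from R 4.0.0 onwards. A quick check, offered as a sketch, confirms that the raw-string patterns are the same strings as the double-escaped ones used earlier.

```r
# Raw strings need R 4.0.0 or later; inside r"( ... )" a backslash is literal
same_extract <- identical(r"(\([^)]+\))", "\\([^)]+\\)")
same_replace <- identical(r"([\(\)])", "[\\(\\)]")

same_extract
#> [1] TRUE
same_replace
#> [1] TRUE
```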
household_id | person | gender | age | marital_status | income_per_week |
---|---|---|---|---|---|
1 | John Smith | F | 40 | Married | 400-499 |
1 | Jane Smith | M | 39 | Married | 300-399 |
1 | David Smith | M | 10 | Never married | Nil |
1 | Mary Smith | F | 8 | Never married | Nil |
2 | John Citizen | M | 32 | Never married | 400-499 |
2 | Jane Citizen | F | 33 | Never married | 1750-1999 |
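As a rough sketch of how person-level records like these become a table of counts, the six rows can be typed into a data frame and cross-tabulated. The data frame name `persons` and the choice of variables to tabulate are assumptions for illustration; the values are copied from the table above.

```r
# The six records from the table above, typed in by hand
persons <- data.frame(
  household_id = c(1, 1, 1, 1, 2, 2),
  person = c("John Smith", "Jane Smith", "David Smith",
             "Mary Smith", "John Citizen", "Jane Citizen"),
  gender = c("F", "M", "M", "F", "M", "F"),
  age = c(40, 39, 10, 8, 32, 33),
  marital_status = c("Married", "Married", "Never married",
                     "Never married", "Never married", "Never married"),
  income_per_week = c("400-499", "300-399", "Nil",
                      "Nil", "400-499", "1750-1999")
)

# One cell per combination of gender and marital status;
# individual people are no longer identifiable in the result
counts <- table(persons$gender, persons$marital_status)
counts
```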
In 2016_GCP_Sequential_Template.xlsx, Sheet "G 17a", the footnote says: "Please note that there are small random adjustments made to all cell values to protect the confidentiality of data. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals."
Do you think that you'll get the same numbers if you use the ones from a different geographical code? E.g. SA1 and STE.
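The effect described in the footnote can be mimicked with a toy simulation. This is only a sketch: the counts are hypothetical, and the adjustment scheme (an independent shift of between -2 and 2 per published value) is an assumption, not the method used by the ABS.

```r
set.seed(2016)

# Hypothetical true counts for one row of a table (not real census values)
true_cells <- c(male = 120, female = 135)
true_total <- sum(true_cells)

# Assumed scheme: every published value is shifted independently
perturb <- function(x) x + sample(-2:2, length(x), replace = TRUE)

published_cells <- perturb(true_cells)
published_total <- perturb(true_total)

# The published cells need not add up to the published total
discrepancy <- sum(published_cells) - published_total
discrepancy
```

Under this assumption, adding up the published values for many small areas would not necessarily reproduce the value published for a larger area.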
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.