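Like Java's HashMap, Python's dictionary, and so on.
R's list is also a key-value data structure, and I personally find it quite handy,
but once you need to stuff a lot of items into a list you run into an annoying performance problem: it gets slow.
The hash package provides a key-value data structure that is simply called hash,
and it runs considerably faster than a list.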
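For reference, here is a minimal base R sketch of the plain-list approach mentioned above. The names `scores`, `alice`, `bob` and `carol` are made up for illustration and do not come from the original post.

```r
# A named list used as a key-value store (base R only, no packages).
scores <- list()

# Insert values under string keys.
scores[["alice"]] <- 90
scores[["bob"]] <- 85

# The keys are just the names of the list.
print(names(scores))

# Look up a value by key.
print(scores[["alice"]])

# Check whether a key exists.
print("carol" %in% names(scores))
```

This works fine for a handful of entries; the slowdown the post talks about only shows up when the list grows large.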
1. Use .set(hash, key = value, ...) to assign key-value pairs
> library(hash)
>
> test <- hash()
>
> .set( test,
+   name = "AirQuality",
+   May = datasets::airquality[which(datasets::airquality$Month==5),],
+   Jun = datasets::airquality[which(datasets::airquality$Month==6),],
+   Jul = datasets::airquality[which(datasets::airquality$Month==7),],
+   Aug = datasets::airquality[which(datasets::airquality$Month==8),]
+ )
After that you can fetch a value by its key, just like with a list!
> head(test[["May"]])
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
>
> head(test$May)
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
Or you can insert a key-value pair directly:
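> test$"Sep" <- datasets::airquality[which(datasets::airquality$Month==5),]
>
> head(test[["Sep"]])
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
(Note that the filter here is still Month==5, so the value stored under "Sep" is actually the May data, which is why the output shows month 5. Use Month==9 to store the real September rows.)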
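The three ways of writing into a hash seen so far can be put side by side in a small sketch. It assumes the hash package is installed; the variable `h` and the keys `a` to `d` are made up for illustration. `values()` is a function of the hash package that the original post does not use.

```r
library(hash)  # assumes the hash package is installed

h <- hash()

# Several pairs at once with .set()
.set(h, a = 1, b = 2)

# One pair at a time with $ or [[ ]]
h$c <- 3
h[["d"]] <- 4

# keys() lists the keys, values() returns the stored values
print(keys(h))
print(values(h))

# Both access styles return the same value
print(identical(h$c, h[["c"]]))
```

The `[[ ]]` form is the one to use when the key is held in a variable, for example inside a loop.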
2. Use has.key(key, hash) and keys(hash) to check which keys a hash contains
> has.key(c("May","Jun"),test)
 May  Jun
TRUE TRUE
> keys(test)
[1] "Aug"  "Jul"  "Jun"  "May"  "name" "Sep"
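Since has.key() accepts a vector of keys, it can be used to filter a list of wanted keys before looking them up. A minimal sketch, assuming the hash package is installed; the hash `h`, its contents, and the variable names `wanted` and `present` are made up for illustration.

```r
library(hash)  # assumes the hash package is installed

h <- hash(May = 5, Jun = 6)

# Keys we would like to read; "Oct" is deliberately missing from h.
wanted <- c("May", "Oct")

# has.key() returns one logical per requested key.
present <- has.key(wanted, h)
print(present)

# Only look up the keys that actually exist.
for (k in wanted[present]) {
  print(h[[k]])
}
```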
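3. Use copy(hash) to copy a hash
The official documentation advises against assigning a hash to another object directly with <-. A hash is passed by reference, so <- only gives you a second name for the same hash rather than a separate copy. When you want an independent copy, use copy(hash) instead.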
> test2<- copy(test)
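The difference is easy to see in a small experiment. This is a sketch assuming the hash package is installed; the names `original`, `alias` and `snapshot` are made up for illustration.

```r
library(hash)  # assumes the hash package is installed

original <- hash(a = 1)

alias    <- original        # plain assignment: same underlying hash
snapshot <- copy(original)  # copy(): a separate hash with the same pairs

# Modify the original after both assignments.
original$b <- 2

print(keys(alias))     # the alias sees the new key
print(keys(snapshot))  # the snapshot does not
```

The alias changes together with the original because both names point at the same object, while the result of copy() stays as it was at the time of copying.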
4. Use clear(hash) to empty a hash
If you simply call rm(hash), the memory occupied by the hash is not released. You have to empty it with clear(hash) first and only then remove it.
> clear(test)
> is.empty(test)
[1] TRUE
> rm(test)
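clear() removes everything at once. To drop individual keys, the hash package also has del(), which the original post does not cover. The sketch below assumes the hash package is installed; `h` and its contents are made up for illustration.

```r
library(hash)  # assumes the hash package is installed

h <- hash(a = 1:10, b = letters)

# Remove a single key-value pair; the rest stays in place.
del("a", h)
print(keys(h))

# Empty the hash completely, then check and remove the object.
clear(h)
emptied <- is.empty(h)
print(emptied)
rm(h)
```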
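5. Performance comparison
Finally, let's run a quick timing test that inserts 10,000 items into a hash and into a list: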
> test1 <- hash()
> test2 <- list()
> system.time(
+   for (i in 1:10000){
+     test1[[as.character(i)]] <- 1:i
+   }
+ )
   user  system elapsed
  1.082   0.059   1.139
> system.time(
+   for (i in 1:10000){
+     test2[[as.character(i)]] <- 1:i
+   }
+ )
   user  system elapsed
  1.770   0.199   2.057
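By elapsed time the hash is about 1.8 times faster (1.139 s versus 2.057 s), so roughly a factor of two. That's it!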
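To repeat the comparison with other sizes, the loop can be wrapped in a small helper. This is only a sketch, assuming the hash package is installed. The function `time_inserts` and the variables `n`, `t_hash` and `t_list` are made up for illustration, and it stores the single number `i` instead of the vector `1:i` used above. Timings depend on the machine and the R version, so the ratio may differ from the one measured in the post.

```r
library(hash)  # assumes the hash package is installed

# Time n insertions of string keys into a container (a hash or a list).
# Returns the elapsed wall-clock time in seconds.
time_inserts <- function(container, n) {
  timing <- system.time(
    for (i in seq_len(n)) {
      container[[as.character(i)]] <- i
    }
  )
  timing[["elapsed"]]
}

n <- 2000  # smaller than the 10000 in the post, to keep the run short

t_hash <- time_inserts(hash(), n)
t_list <- time_inserts(list(), n)

cat("hash:", t_hash, "s   list:", t_list, "s\n")
```

Running it for a few increasing values of `n` gives a better picture than a single measurement, since the gap is expected to matter mostly for large containers.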