YouTube 8M dataset download

Shakerato

Shakeratto 2018. 6. 1. 17:02

* What is YouTube 8M dataset?

* How to download the dataset?

* Download the dataset on Linux only, not on Windows, because the dataset contains file names that differ only in letter case.

(Windows does not allow files such as 'Ab.txt' and 'aB.txt' to exist in the same folder.)
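
To illustrate the problem, here is a minimal sketch that groups file names differing only in letter case. The function name find_case_collisions is hypothetical, and the names in the example are the ones used above, not real dataset shards.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that would collide on a case-insensitive file system."""
    groups = defaultdict(list)
    for name in names:
        # casefold() gives a case-insensitive key, so 'Ab.txt' and 'aB.txt' share one key.
        groups[name.casefold()].append(name)
    # Keep only the keys that more than one name maps to.
    return [sorted(group) for group in groups.values() if len(group) > 1]


collisions = find_case_collisions(["Ab.txt", "aB.txt", "cd.txt"])
print(collisions)
```

Running this on a directory listing made on Linux (for example, the output of os.listdir) shows which files would overwrite each other if copied to Windows.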

1. Download the 'download_fix.py' file from this URL:

2. Change the code to download the dataset (video-level features)

This step is optional. You can download the dataset with the following command:

'curl data.yt8m.org/download.py | partition=2/video/train mirror=us python'

I modified the code because downloads are sometimes interrupted by network latency.

2.1. Set os.environ['partition'] to '2/video/train', '2/video/validate', or '2/video/test'

2.2. Set os.environ['mirror'] to 'us', 'eu', or 'asia'

2.3. Comment out all the code that checks whether the 'os.environ' values (partition, mirror, shard) are set

2.4. Comment out the code that checks whether 'curl' is installed

2.5. Change the download code to use urllib, as in the linked post and the substitutions below. This step is important.

2.5.1. os.system('curl %s > %s' % (plan_url, plan_filename))

-> fileDownload(plan_url, plan_filename)

2.5.2. os.system('curl %s > %s' % (download_url, f))

-> fileDownload(download_url, f)
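
The post does not show the body of fileDownload, so the following is only a sketch of what such a helper might look like with urllib in Python 3. The retries and wait_seconds parameters are my own additions (assumptions, not part of the original code), included because the stated motivation is that downloads get interrupted.

```python
import http.client
import shutil
import time
import urllib.request


def fileDownload(url, filename, retries=3, wait_seconds=5):
    """Download url into filename, retrying when the transfer fails.

    retries and wait_seconds are hypothetical parameters added for this sketch.
    """
    for attempt in range(1, retries + 1):
        try:
            # Stream the response to disk so a large shard is never held in memory.
            with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
                shutil.copyfileobj(response, out)
            return
        except (OSError, http.client.HTTPException) as err:
            # urllib.error.URLError is a subclass of OSError, so it is covered here.
            if attempt == retries:
                raise
            print("Download failed (%s), retrying %d/%d" % (err, attempt, retries))
            time.sleep(wait_seconds)
```

With a helper of this shape, the two substitutions above work unchanged, since both call it with a URL and a target file name.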

3. Run the code once for each partition (train, validate, test), changing the value set in step 2.1 each time.
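
As an alternative to editing the script between runs, the three runs could be driven from a small wrapper. This is a sketch under a loud assumption: it only works if the script reads 'partition' and 'mirror' from the environment rather than hard-coding them as in step 2.1. The names PARTITIONS, build_env, and download_all are hypothetical.

```python
import os
import subprocess
import sys

# The three video-level partitions listed in step 2.1.
PARTITIONS = ["2/video/train", "2/video/validate", "2/video/test"]


def build_env(partition, mirror="us"):
    """Copy the current environment and set the two variables the script reads."""
    env = dict(os.environ)
    env["partition"] = partition
    env["mirror"] = mirror
    return env


def download_all(script="download_fix.py", mirror="us"):
    """Run the download script once per partition (assumes it honors the environment)."""
    for partition in PARTITIONS:
        # check=True stops the loop if one run exits with an error.
        subprocess.run([sys.executable, script], env=build_env(partition, mirror), check=True)


# download_all()  # uncomment to start the three downloads
```

If the script hard-codes the partition as described in step 2.1, this wrapper has no effect and the manual edit is still needed.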
