sigmoid.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A social space for people researching, working with, or just interested in AI!

Server stats:

596
active users

Useful new tip I learned from by @jh :

Train a model _before_ data cleaning to identify the items that are the most troublesome.

This might also help you identify the types of problems that are occurring so that next time it can help you in collecting better datasets.

Dr Kamran Soomro

So I’ve been wondering what problem to work on for my learning while I work my way through . I think I’ve found the perfect dataset: figshare.com/collections/_/456

It’s got everything. The raw 12-lead ECG readings as well as arrhythmia classifications for over 10,000 patients. And best of all it’s easy to convert the ECG readings to images for . Will probably start by trying out initially.

Will keep posting my progress here.

figshareA 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patientsThis newly inaugurated research database for 12-lead electrocardiogram signals was created under the auspices of Chapman University and Shaoxing People's Hospital (Shaoxing Hospital Zhejiang University School of Medicine) and aims at enabling the scientific community in conducting new studies on arrhythmia and other cardiovascular conditions. Certain types of arrhythmias, such as atrial fibrillation, have a pronounced negative impact on public health, quality of life, and medical expenditures. As a non-invasive test, the long term ECG monitoring is a major and vital diagnostic tool for detecting these conditions. However, such a practice generates a considerable amount of data that analyzes of which require considerable time and effort by human experts. Advancement of modern machine learning and statistical tools can be trained on high quality, large data to achieve high levels of automated diagnostic accuracy. Thus, we collected and disseminated this novel database that contains 12-lead ECGs of 10,646 patients with 500 Hz sampling rate that features 11 common rhythms and 67 additional cardiovascular conditions, all labeled by professional experts. For each subject, a sample size of 10 seconds (12-dimension 5000 samples) was available. The dataset can be used to design, compare, and fine tune new and classical statistical and machine learning techniques in studies focused on arrhythmia and other cardiovascular conditions.Citation:Zheng, J., Chu, H., Struppa, D. et al. Optimal Multi-Stage Arrhythmia Classification Approach. Sci Rep 10, 2898 (2020). https://doi.org/10.1038/s41598-020-59821-7Zheng, J., Zhang, J., Danioko, S. et al. A 12-lead electrocardiogram database for arrhythmia research covering more than 10,000 patients. Sci Data 7, 48 (2020). https://doi.org/10.1038/s41597-020-0386-x