Recent advances in data science applications have led to groundbreaking benefits: in health sciences, where social network analytics have enabled the tracking of epidemics; in financial systems, where investment decisions are guided by analysis of large volumes of data; in speech recognition applications; and more. These achievements only hint at what is possible; the full impact of data science is yet to be realized. Data science is inherently multidisciplinary. It builds on a core set of technologies in close interaction with application domains (e.g., finance, health, environment) that leverage this core to solve their most pressing problems. The ability to extract insights from the massive expansion of data depends on a comprehensive, systematic, end-to-end study of the data science lifecycle. The core technologies of data science research are associated with stages in this lifecycle. Issues of data security and privacy, as well as the impact of data science on society and on policy, intersect with each of these stages.
Although there is existing research on many of these topics, a holistic approach is required, focusing on four data science challenges: (1) the need to develop core technologies in each stage of the data lifecycle; (2) the gap in understanding and leveraging the interconnections among these stages; (3) the cross-cutting themes of security, privacy, ethics, policy and social impact; and (4) synergy with applications.
The Canadian Workshop on Data Science (CWDS) is organized to focus on these issues by bringing together researchers, industry, and public and not-for-profit organizations with interests in data science to formulate a broad plan for research and engagement. The outcome of the workshop will be a fuller understanding of the research issues and a plan for better engagement among stakeholders.
The program is organized to support this objective:
- Two plenary keynote addresses from experienced leaders of similar initiatives.
- Break-out sessions focussed on the fundamental core technologies and the interactions among them, with representatives of all stakeholder groups in each session.
- A panel discussion focussed on user communities, both those generating large volumes of data and those using large volumes of data for decision making, to highlight their data science needs and stress points.