16 Results

Building Legal Literacies for Text Data Mining

Unrestricted Use

Public Domain

For those learning about fair use, this book offers a specific example of how fair use can apply to text data mining research. It also covers basic copyright and fair use more generally, as well as the specifics of text data mining.
From the "about" section of the book:

"This book explores the legal literacies covered during the virtual Building Legal Literacies for Text Data Mining Institute, including copyright (both U.S. and international law), technological protection measures, privacy, and ethical considerations. It describes in detail how we developed and delivered the 4-day institute, and also provides ideas for hosting shorter literacy teaching sessions. Finally, we offer reflections and take-aways on the Institute."

Subject: Applied Science; Computer Science; Information Science; Law
Material Type: Reading
Provider: UC Berkeley
Author: Beth Cate; Brandon Butler; Brianna L. Schofield; Kyle K. Courtney; Glen Worthey; David Bamman; Maria Gould; Megan Senseney; Scott Althaus; Thomas Padilla
Date Added: 10/29/2021

Unrestricted Use

Public Domain

DASHlink

DASHlink is a virtual laboratory for scientists and engineers to disseminate results and collaborate on research problems in health management technologies for aeronautics systems. Managed by the Integrated Vehicle Health Management project within NASA's Aviation Safety program, the Web site is designed to be a resource for anyone interested in data mining, IVHM, aeronautics and NASA.

Subject: Applied Science; Computer Science; Engineering
Material Type: Lecture; Primary Source; Reading; Simulation
Provider: NASA
Date Added: 07/11/2003

Unrestricted Use

CC BY

Data Mining

A number of successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, and stock market investments.
This course will examine methods that have emerged from both fields and proven to be of value in recognizing patterns and making predictions from an applications perspective. We will survey applications and provide an opportunity for hands-on experimentation with algorithms for data mining using easy-to-use software and cases.

Material Type: Full Course
Date Added: 02/16/2015

The Data Renaissance: Analyzing the Disciplinary Effects of Big Data, Artificial Intelligence, and Beyond

Conditional Remix & Share Permitted

CC BY-NC-SA

The Data Renaissance delves into the complexities of data's role in various industries and its broader impact on society. It highlights the challenges in investigating data practices, citing examples like TikTok, where algorithms and data handling are closely guarded secrets. The content, contributed by students under the guidance of an expert, covers a wide range of topics, including the ethical aspects of generative AI in education and the workplace, and case studies reflecting real-world experiences. This evolving text, intended to be updated with each class, serves as a dynamic resource for educators and students alike, offering insights and discussion guides for an in-depth understanding of the ever-changing landscape of data in our digital age.

Subject: Applied Science; Computing and Information
Material Type: Activity/Lab; Textbook
Provider: Remixing Open Textbooks through an Equity Lens (ROTEL) Project
Author: J.J. Sylvia IV
Date Added: 03/07/2024

Database (08:04): Some Final Bits

Only Sharing Permitted

CC BY-ND

Our final database video. This one looks at some odds and ends. We examine: Data Warehouse, Data Mining, Big Data. I also talk about the ethics of data mining from the NSA and CDC, and how they are different.

We also give out top picks for the lesson.

Links from Video:
• http://www.w3schools.com/sql/
• What is Database & SQL by Guru99: http://youtu.be/FR4QIeZaPeM
• What is a database: http://youtu.be/t8jgX1f8kc4
• MySQL Database For Beginners: https://www.udemy.com/mysql-database-for-beginners2/

Subject: Applied Science; Business and Communication; Information Science
Material Type: Lecture
Provider: Mr. Ford's Class
Author: Scott Ford
Date Added: 09/26/2014

Digital Humanities

Conditional Remix & Share Permitted

CC BY-NC-SA

This course examines the theory and practice of using computational methods in the emerging field of digital humanities. It develops an understanding of key digital humanities concepts, such as data representation, digital archives, information visualization, and user interaction through the study of contemporary research, in conjunction with working on real-world projects for scholarly, educational, and public needs. Students create prototypes, write design papers, and conduct user studies.

Subject: Applied Science; Arts and Humanities; Career and Technical Education; Computer Science; Engineering; Graphic Arts; Graphic Design
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Fendt, Kurt; Stuhl, Andy Kelleher
Date Added: 02/01/2015

Engaging Researchers with Data Management: The Cookbook

Unrestricted Use

CC BY

Effective Research Data Management (RDM) is a key component of research integrity and reproducible research, and its importance is increasingly emphasised by funding bodies, governments, and research institutions around the world. However, many researchers are unfamiliar with RDM best practices, and research support staff are faced with the difficult task of delivering support to researchers across different disciplines and career stages. What strategies can institutions use to solve these problems?

Engaging Researchers with Data Management is an invaluable collection of 24 case studies, drawn from institutions across the globe, that demonstrate clearly and practically how to engage the research community with RDM. These case studies together illustrate the variety of innovative strategies research institutions have developed to engage with their researchers about managing research data. Each study is presented concisely and clearly, highlighting the essential ingredients that led to its success and challenges encountered along the way. By interviewing key staff about their experiences and the organisational context, the authors of this book have created an essential resource for organisations looking to increase engagement with their research communities.

This handbook is a collaboration by research institutions, for research institutions. It aims not only to inspire and engage, but also to help drive cultural change towards better data management. It has been written for anyone interested in RDM, or simply, good research practice.

Subject: Applied Science; Information Science
Material Type: Textbook
Provider: Open Book Publishers
Author: Connie Clare; Elli Papadopoulou; Iza Witkowska; James Savage; Joanne Yeomans; Maria Cruz; Marta Teperek; Yan Wang
Date Added: 11/01/2020

Management of Services: Concepts, Design, and Delivery

Conditional Remix & Share Permitted

CC BY-NC-SA

15.768 Management of Services: Concepts, Design, and Delivery explores the use of operations tools and perspectives in the service sector, including both for-profit and not-for-profit organizations. The course builds on conceptual frameworks and cases from a wide range of service operations, selected from health care, hospitality, internet services, supply chain, transportation, retailing, food service, entertainment, financial services, humanitarian services, government services, and others.

Subject: Applied Science; Business and Communication; Computer Science; Engineering; Management
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Fine, Charles
Date Added: 09/01/2010

Mathematics of Big Data and Machine Learning

Conditional Remix & Share Permitted

CC BY-NC-SA

This course introduces the Dynamic Distributed Dimensional Data Model (D4M), a breakthrough in computer programming that combines graph theory, linear algebra, and databases to address problems associated with Big Data. Search, social media, ad placement, mapping, tracking, spam filtering, fraud detection, wireless communication, drug discovery, and bioinformatics all attempt to find items of interest in vast quantities of data. This course teaches a signal processing approach to these problems by combining linear algebraic graph algorithms, group theory, and database design. This approach has been implemented in software. The class will begin with a number of practical problems, introduce the appropriate theory, and then apply the theory to these problems. Students will apply these ideas in the final project of their choosing. The course will contain a number of smaller assignments which will prepare the students with appropriate software infrastructure for completing their final projects.

Subject: Applied Science; Business and Communication; Computer Science; Engineering
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Gadepally, Vijay; Kepner, Jeremy
Date Added: 01/01/2020

Modeling and Assessment for Policy

Conditional Remix & Share Permitted

CC BY-NC-SA

IDS.410J Modeling and Assessment for Policy explores how scientific information and quantitative models can be used to inform policy decision-making. Students will develop an understanding of quantitative modeling techniques and their role in the policy process through case studies and interactive activities. The course addresses issues such as analysis of scientific assessment processes, uses of integrated assessment models, public perception of quantitative information, methods for dealing with uncertainties, and design choices in building policy-relevant models. Examples used in this class focus on models and information used in earth system governance.

Subject: Applied Science; Atmospheric Science; Career and Technical Education; Computer Science; Engineering; Environmental Science; Environmental Studies; Mathematics; Physical Science; Political Science; Social Science
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Selin, Noelle
Date Added: 02/01/2013

Prediction: Machine Learning and Statistics

Conditional Remix & Share Permitted

CC BY-NC-SA

Prediction is at the heart of almost every scientific discipline, and the study of generalization (that is, prediction) from data is the central topic of machine learning and statistics, and more generally, data mining. Machine learning and statistical methods are used throughout the scientific world to handle the "information overload" that characterizes our current digital age. Machine learning developed from the artificial intelligence community, mainly within the last 30 years, at the same time that statistics has made major advances due to the availability of modern computing. However, parts of these two fields aim at the same goal, that is, prediction from data. This course provides a selection of the most important topics from both of these subjects.

Subject: Applied Science; Computer Science; Engineering; Mathematics; Statistics and Probability
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Rudin, Cynthia
Date Added: 02/01/2012

Statistical Thinking and Data Analysis

Conditional Remix & Share Permitted

CC BY-NC-SA

This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and nonparametric statistics.

Subject: Applied Science; Computer Science; Engineering; Mathematics; Statistics and Probability
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Bisias, Dimitrios; Chang, Allison; Rudin, Cynthia
Date Added: 09/01/2011

Systems Optimization: Models and Computation (SMA 5223)

Conditional Remix & Share Permitted

CC BY-NC-SA

This class is an applications-oriented course covering the modeling of large-scale systems in decision-making domains and the optimization of such systems using state-of-the-art optimization tools. Application domains include: transportation and logistics planning, pattern classification and image processing, data mining, design of structures, scheduling in large systems, supply-chain management, financial engineering, and telecommunications systems planning. Modeling tools and techniques include linear, network, discrete and nonlinear optimization, heuristic methods, sensitivity and post-optimality analysis, decomposition methods for large-scale systems, and stochastic optimization.
This course was also taught as part of the Singapore-MIT Alliance (SMA) programme as course number SMA 5223 (System Optimisation: Models and Computation).

Subject: Applied Science; Computer Science; Engineering; Mathematics
Material Type: Full Course
Provider: MIT
Provider Set: MIT OpenCourseWare
Author: Freund, Robert; Magnanti, Thomas; Sun, Jie
Date Added: 02/01/2004

Understanding Data Mining

Read the Fine Print

Educational Use

Students learn basic data analysis tools and techniques in AP Statistics, but often don't work with large sets of real-world data. This project gives students exposure to how data is analyzed in many of America's top corporations, universities and banks. By using multiple input variables, students learn how to develop realistic prediction models for the demand for goods and services.

Subject: Mathematics; Statistics and Probability
Material Type: Lesson Plan
Provider: North Carolina State University
Provider Set: Kenan Fellows Program for Curriculum and Leadership Development
Author: Celia Rowland
Date Added: 03/03/2016

Understanding algorithms and big data in the job market

Conditional Remix & Share Permitted

CC BY-NC-SA

This interactive lesson helps students understand how companies use algorithms to sort job applicants. It also encourages students to reflect on how digital data mining can contribute to the hiring process. Students examine resumes and digital data to consider the ways in which our data may open or close opportunities in an increasingly digitized hiring market.

Subject: Applied Science; Business and Communication; Computer Science; English Language Arts; Information Science
Material Type: Lesson
Date Added: 08/05/2019

Wide-Open: Accelerating public data release by automating detection of overdue datasets

Unrestricted Use

CC BY

Open data is a vital pillar of open science and a key enabler for reproducibility, data reuse, and novel discoveries. Enforcement of open-data policies, however, largely relies on manual efforts, which invariably lag behind the increasingly automated generation of biological data. To address this problem, we developed a general approach to automatically identify datasets overdue for public release by applying text mining to identify dataset references in published articles and parse query results from repositories to determine if the datasets remain private. We demonstrate the effectiveness of this approach on 2 popular National Center for Biotechnology Information (NCBI) repositories: Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA). Our Wide-Open system identified a large number of overdue datasets, which spurred administrators to respond directly by releasing 400 datasets in one week.

Subject: Biology; Life Science
Material Type: Reading
Provider: PLOS Biology
Author: Bill Howe; Hoifung Poon; Maxim Grechkin
Date Added: 08/07/2020