Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): StackOverFlow. Original address: http://stackoverflow.com/questions/7782594/

Time: 2020-09-08 08:21:37  Source: igfitidea

What is the difference between dataset and database?

database dataset

Asked by Lokesh Sah

What is the difference between a dataset and a database? If they are different, then how?

Why is huge data difficult to manage using databases today?

Please answer independent of any programming language.

Answered by Mike Sherrill 'Cat Recall'

In American English, database usually means "an organized collection of data". A database is usually under the control of a database management system, which is software that, among other things, manages multi-user access to the database. (Usually, but not necessarily. Some simple databases are just text files processed with interpreted languages like awk and Python.)
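
To make the "just text files" case concrete, here is a minimal sketch in Python. The file contents and the helper names `select_where` and `insert_row` are made up for illustration; the point is only that a plain comma-separated text file can be queried and extended with a few lines of script, with no DBMS (and no multi-user control) involved.

```python
import csv
import io

# Hypothetical contents of a plain-text "database" file (illustration only).
FLAT_FILE = """id,name,city
1,Ann,Boston
2,Bob,Denver
3,Cy,Boston
"""

def select_where(text, column, value):
    """Return every record whose `column` equals `value`."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row[column] == value]

def insert_row(text, row):
    """Return new file contents with `row` appended as one more record."""
    fieldnames = next(csv.reader(io.StringIO(text)))  # header line
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writerow(row)
    return text + out.getvalue()

if __name__ == "__main__":
    print(select_where(FLAT_FILE, "city", "Boston"))
```

Nothing here stops two scripts from rewriting the file at the same time, which is exactly the kind of job a database management system takes over.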

In the SQL world, which is what I'm most familiar with, a database includes things like tables, views, stored procedures, triggers, permissions, and data.
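
As a rough illustration of "more than just data", the sketch below uses Python's built-in `sqlite3` module to build a tiny in-memory database containing two tables, a view and a trigger, then lists the objects the database knows about. The table, view and trigger names are invented for the example. SQLite is used only because it ships with Python; it has no stored procedures or user permissions, so those items from the list above are not shown.

```python
import sqlite3

def build_demo_database():
    """Create an in-memory SQLite database with tables, a view and a trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
        CREATE TABLE audit_log (patient_id INTEGER, note TEXT);
        CREATE VIEW adults AS SELECT id, name FROM patients WHERE age >= 18;
        CREATE TRIGGER log_insert AFTER INSERT ON patients
        BEGIN
            INSERT INTO audit_log (patient_id, note) VALUES (NEW.id, 'inserted');
        END;
    """)
    conn.executemany(
        "INSERT INTO patients (name, age) VALUES (?, ?)",
        [("Ann", 34), ("Bob", 12)],
    )
    conn.commit()
    return conn

def list_objects(conn):
    """Return (type, name) for every schema object stored in the database."""
    return conn.execute(
        "SELECT type, name FROM sqlite_master ORDER BY type, name"
    ).fetchall()

if __name__ == "__main__":
    conn = build_demo_database()
    for kind, name in list_objects(conn):
        print(kind, name)
```

The rows in `patients` are the data; the view and the trigger are behaviour that lives in the database alongside the data.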

Again, in American English, dataset usually refers to data selected and arranged in rows and columns for processing by statistical software. The data might have come from a database, but it might not.
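
A small sketch of that "selected and arranged in rows and columns" step, assuming the data does come from a database. The `visits` table and the function names are hypothetical; the query picks some rows and columns, and the result is written out as CSV text that a statistics package could read.

```python
import csv
import io
import sqlite3

def make_database():
    """Build a small in-memory database to select from (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (patient TEXT, ward TEXT, days INTEGER)")
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?, ?)",
        [("Ann", "A", 3), ("Bob", "B", 5), ("Cy", "A", 2)],
    )
    return conn

def extract_dataset(conn, ward):
    """Select the rows and columns of interest: this result is the dataset."""
    cur = conn.execute(
        "SELECT patient, days FROM visits WHERE ward = ? ORDER BY patient",
        (ward,),
    )
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()

def to_csv(header, rows):
    """Arrange the dataset as CSV text for use outside the database."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()

if __name__ == "__main__":
    header, rows = extract_dataset(make_database(), "A")
    print(to_csv(header, rows))
```

Once exported, the CSV text no longer needs the database at all, which matches the idea that a dataset is only data.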

Answered by jetgrrrl

A dataset is the data... usually in a table, though it can be XML or other types of data. However, it's only data... it doesn't really do anything.

And as you know, a database is a container for the dataset, usually with built-in infrastructure around it to interact with it.

Huge data isn't hard to manage for what I do. I guess you're asking a study-related question?

Answered by G M

Database

The definitions of the two terms are not always clear. In general, a database is a set of data organized and accessible using a database management system (DBMS). Databases usually, but not always, are composed of several tables linked together, which are accessed, modified and updated by various users, often simultaneously.

Cambridge dictionary:

A structured set of data held in a computer, especially one that is accessible in various ways.

Merriam-Webster:

a usually large collection of data organized especially for rapid search and retrieval (as by a computer)

Data set (or dataset)

A data set sometimes refers to the contents of a single database table, but this is quite a restrictive definition. In general, as the name suggests, it is a set (or collection) of data; hence there are datasets of images, like the Caltech-256 Object Category Dataset, or of videos, e.g. A large-scale benchmark dataset for event recognition in surveillance video. A data set is usually designed for analysis rather than for continual updates from different users; hence it represents the end of a data collection or a snapshot at a specific time.
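
The "snapshot" idea can be sketched in a few lines of Python. The `readings` table and the helper names are assumptions made for the example: a dataset is copied out of a database at one moment, the database then keeps changing, and the copy stays as it was.

```python
import sqlite3

def take_snapshot(conn):
    """Copy the current contents of the table out of the database."""
    return conn.execute(
        "SELECT sensor, reading FROM readings ORDER BY sensor"
    ).fetchall()

def demo():
    """Return (snapshot taken earlier, live contents after later updates)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT PRIMARY KEY, reading REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [("s1", 20.5), ("s2", 21.0)],
    )
    snapshot = take_snapshot(conn)  # the dataset: frozen at this point

    # The database keeps being updated after the snapshot was taken.
    conn.execute("UPDATE readings SET reading = 25.0 WHERE sensor = 's1'")
    conn.execute("INSERT INTO readings VALUES ('s3', 19.0)")
    live = take_snapshot(conn)
    return snapshot, live

if __name__ == "__main__":
    snapshot, live = demo()
    print("snapshot:", snapshot)
    print("live:    ", live)
```

The snapshot is what you would hand to an analysis; the live table is what users keep modifying.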

Oxford dictionary:

A collection of related sets of information that is composed of separate elements but can be manipulated as a unit by a computer.

'all hospitals must provide a standard data set of each patient's details'

Cambridge dictionary:

a collection of separate sets of information that is treated as a single unit by a computer

Answered by Maiden

A dataset is just a set of data (it may be relevant to some people and not to others), whereas a database is a software/hardware component that organizes and stores data or datasets. In practice, the two are different things.

Huge data needs more infrastructure and components (hardware and software), or more computing power and storage, for efficient storage and retrieval. The more data there is, the more components are needed, and hence the harder it is to manage. Modern databases provide good infrastructure for processing huge amounts of data (both reads and writes); see, for example, data lake management by Microsoft, which manages relational data and datasets extensively.