Big-data: Acid versus base for database transactions
Authors: Narsimha Banothu, ShankarNayak Bhukya, K. Sharma
Published in: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)
Publication date: 2016-03-03
DOI: 10.1109/ICEEOT.2016.7755401 (https://doi.org/10.1109/ICEEOT.2016.7755401)
Citations: 9
Abstract
Database developers all know the ACID acronym. It says that database transactions should be Atomic, Consistent, Isolated, and Durable. These qualities seem indispensable, and yet in very large systems they are hard to reconcile with availability and performance. For example, suppose you run an online bookstore and you proudly display how many copies of each book you have in your inventory. Every time someone is in the process of buying a book, you lock part of the database until they finish, so that all visitors around the world see accurate inventory numbers. That works well if you run The Shop Around the Corner but not if you run Amazon.com. Amazon might instead use cached data. Users would not see the inventory count at this second, but what it was, say, an hour ago when the last snapshot was taken. Also, Amazon might violate the “I” in ACID by tolerating a small probability that simultaneous transactions could interfere with each other. For example, two customers might both believe that they just purchased the last copy of a certain book. The company might risk having to apologize to one of the two customers (and maybe compensate them with a gift card) rather than slowing down its site and irritating myriad other customers. There is a computer science theorem that formalizes the inevitable trade-offs. Eric Brewer's CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system continues to work when the network between nodes drops or delays messages, short of a total network failure. A few nodes can become unreachable and the system keeps going.) An alternative to ACID is BASE: Basically Available, Soft state, Eventual consistency.
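
The bookstore example can be made concrete with a small sketch. It is not taken from the paper: the `Inventory` class and the helpers `run_isolated` and `run_stale` are hypothetical names chosen for illustration, and an in-memory counter stands in for a real database. The first path checks and decrements the stock under a lock, which is roughly what the isolated, lock-based approach does. The second path trusts a count read earlier, as a cached snapshot would, so two customers can both "buy" the last copy.

```python
import threading


class Inventory:
    """Toy in-memory stock counter for one book title (hypothetical)."""

    def __init__(self, copies):
        self.copies = copies
        self._lock = threading.Lock()

    def buy_isolated(self):
        # Check and decrement happen under one lock, so no other purchase
        # can slip in between the two steps.
        with self._lock:
            if self.copies > 0:
                self.copies -= 1
                return True
            return False

    def snapshot(self):
        # Stands in for a cached count that may be stale when it is used.
        return self.copies

    def buy_from_snapshot(self, seen):
        # Trusts the count the customer saw earlier instead of re-checking.
        if seen > 0:
            self.copies -= 1
            return True
        return False


def run_isolated():
    """Two customers buy in turn; each purchase is checked against live stock."""
    stock = Inventory(copies=1)
    results = [stock.buy_isolated(), stock.buy_isolated()]
    return results, stock.copies


def run_stale():
    """Both customers load the page before either one checks out."""
    stock = Inventory(copies=1)
    seen_a, seen_b = stock.snapshot(), stock.snapshot()
    results = [stock.buy_from_snapshot(seen_a), stock.buy_from_snapshot(seen_b)]
    return results, stock.copies


if __name__ == "__main__":
    print(run_isolated())  # ([True, False], 0)
    print(run_stale())     # ([True, True], -1)
```

In the stale path the stock ends at -1: one of the two orders cannot be filled, which is the case where the store apologizes to a customer. The interleaving is written out step by step rather than with real threads so that the outcome is deterministic; in a live system the same conflict would occur only occasionally, which is why a large site might accept it in exchange for never blocking readers.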