Common Optimization Methods
As business data accumulates, any database will sooner or later run into data pile-up that slows down the business systems built on it. How to store and use data sensibly then becomes an important problem that technical staff must face.
Database speed problems can be attacked along two technical dimensions, vertical and horizontal, to improve response times.
Vertical scaling means mining the potential of a single server and optimizing the technical details of database operation, for example tuning indexes to speed up queries. This approach only improves matters locally, however, and because it is bounded by the overall hardware capacity of the server, its ultimate effect is limited.
Horizontal scaling means spreading data that one server used to handle across multiple servers, and separating reads from writes. This can improve database response times dramatically, and horizontal growth is linear, constrained very little by any single piece of hardware. For a database designed for distributed use, as MongoDB is, this is a natural technical direction.
When it comes to MongoDB performance, the best engineers think through read and write performance at the design stage and make accurate, well-reasoned optimization decisions up front; competent engineers use tools to discover problems proactively and promptly, then fix them in a targeted way; weaker engineers discover problems passively and only then fix them.
The sections below offer techniques from these angles: design, early problem detection, and targeted tuning. MongoDB optimization covers a very wide range; common topics include optimizing individual commands on a single server (indexes, for example), optimizing the data itself, and optimizing distributed reads and writes.
While the data volume still allows it, start with vertical optimization on a single server. This is the most common class of read/write optimization, and one of the fundamentals that marks a solid database engineer.
I. Enabling Slow-Command Detection (Database Profiler)
When certain screens of an e-commerce platform respond slowly, the underlying database is a prime suspect, but how do you examine and test the commands it is running? MongoDB can automatically detect commands that execute too slowly: enable the profiler with db.setProfilingLevel to record a list of problem commands automatically, then display that list with db.system.profile.find. With the list in hand, you can focus immediately on the offending commands for targeted testing and tuning.
Parameters of db.setProfilingLevel(level, slowms):
- level sets the profiling level: 0 turns the profiler off; 1 records slow commands, meaning any command exceeding the time threshold (100 milliseconds by default) is written to the system.profile collection; 2 records every command executed.
- slowms is optional and sets the time threshold in milliseconds; any command that takes longer is treated as slow.
Basic usage, executed on the mongod instance:
```
# Enable slow-command recording; by default, commands taking more than 100 ms are recorded
> db.setProfilingLevel(1)
# Enable slow-command recording with the threshold raised to 300 ms
> db.setProfilingLevel(1, 300)
```
In production, after the profiler has been running for a while, inspect what it has recorded with db.system.profile.find:
```
> db.system.profile.find({ millis: { $gt: 100 } })   # millis is the elapsed-time key in the profiler log
```
For demonstration purposes, db.setProfilingLevel(2) records everything:
```
> db.setProfilingLevel(2)
> db.stats()
> db.system.profile.find().pretty()
# The output looks like this:
{
    "op" : "command",
    "ns" : "test",
    "command" : {
        "dbstats" : 1,
        "scale" : undefined,
        "lsid" : {
            "id" : UUID("1258dc1d-cd38-42af-ac60-f2f34c273690")
        },
        "$clusterTime" : {
            "clusterTime" : Timestamp(1580479289, 1),
            "signature" : {
                "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId" : NumberLong(0)
            }
        },
        "$db" : "test"
    },
    "numYield" : 0,
    "locks" : {
        "ParallelBatchWriterMode" : { "acquireCount" : { "r" : NumberLong(1) } },
        "ReplicationStateTransition" : { "acquireCount" : { "w" : NumberLong(1) } },
        "Global" : { "acquireCount" : { "r" : NumberLong(1) } },
        "Database" : { "acquireCount" : { "r" : NumberLong(1) } },
        "Collection" : { "acquireCount" : { "r" : NumberLong(3) } },
        "Mutex" : { "acquireCount" : { "r" : NumberLong(1) } }
    },
    "flowControl" : {},
    "responseLength" : 386,
    "protocol" : "op_msg",
    "millis" : 0,
    "ts" : ISODate("2020-01-31T14:01:38.820Z"),
    "client" : "127.0.0.1",
    "appName" : "MongoDB Shell",
    "allUsers" : [ ],
    "user" : ""
}
```
The main fields in a profiler log entry:
- op: the type of operation recorded
- ns: the namespace (database and collection) the command ran against
- ts: the time the command started executing
- millis: the time the command took to execute, in milliseconds
Note:
With slow-command recording enabled, the server incurs extra runtime overhead that affects its performance to some degree. Once testing is finished, be sure to run db.setProfilingLevel(0).
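If you are unsure what the profiler is currently set to, the shell can report it directly; a minimal sketch (the values shown are illustrative):
```
> db.getProfilingStatus()        // report the current level and slow-command threshold
{ "was" : 1, "slowms" : 300, "sampleRate" : 1 }
> db.setProfilingLevel(0)        // switch the profiler off again after testing
```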
II. Analyzing Problem Statements with explain
Suppose slow-command detection has revealed an indexed find() that runs slowly. The next step is to use explain to inspect the command's execution in detail, along these lines:
```
> db.books.createIndex({name:1})    # index the name key; the books collection already holds data
> db.books.createIndex({price:1})   # index the price key
> db.books.find({price:{$gt:30}}, {name:1}).explain("executionStats")   # analyze the find with explain()
```
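What matters in the explain output is the winning plan's stage (IXSCAN for an index scan versus COLLSCAN for a full collection scan) and the executionStats counters. A trimmed, illustrative extract (the figures are made up for this sketch):
```
"executionStats" : {
    "nReturned" : 5,             // documents returned to the client
    "executionTimeMillis" : 2,   // total execution time
    "totalKeysExamined" : 5,     // index entries scanned
    "totalDocsExamined" : 5      // documents scanned; much larger than nReturned signals a poor plan
}
```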
III. Analyzing Logs with mtools
mtools is a widely used open-source toolkit for MongoDB health checks. It can present monitoring data graphically, so problems are spotted and resolved promptly, and it is particularly good at targeted analysis of MongoDB log files, which makes problems easier to find.
Address: https://github.com/rueckstiess/mtools
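Two of its commands cover the most common log questions; a sketch assuming mtools has been installed (for example via pip) and the server log is in mongod.log:
```
# summarize the query patterns found in a log file, with counts and durations
mloginfo mongod.log --queries
# keep only the log lines for operations slower than 1000 ms
mlogfilter mongod.log --slow 1000
```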
IV. The mongoreplay Traffic Tool
mongoreplay is a powerful traffic-capture and replay tool that can inspect and record the commands sent to a MongoDB database. Official address: https://docs.mongodb.com/manual/reference/program/mongoreplay/index.html
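A typical workflow is to capture production traffic and replay it against a test deployment; a sketch, where the network interface eth0, the capture file playback.bson, and the target host are assumptions to adapt:
```
# capture MongoDB wire traffic seen on port 27017 into a playback file
mongoreplay record -i eth0 -e "port 27017" -p playback.bson
# replay the captured workload against another deployment
mongoreplay play -p playback.bson --host mongodb://192.168.0.4:27017
```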
V. The mongostat Monitoring Tool
mongostat is MongoDB's built-in status-checking tool. Run from the command line, it samples MongoDB's current running state at a fixed interval and prints the result. If MongoDB seems slow to respond or shows other problems, reach for this tool first to get a quick picture of the database's state; it mainly provides a large set of running statistics.
```
# Return status data 20 times at 1-second intervals
[root@slave01 ~]# mongostat -n 20 1
insert query update delete getmore command dirty used flushes vsize   res qrw arw net_in net_out conn                time
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   165b   35.5k    3 Feb  1 15:53:35.523
    *0    *0     *0     *0       0     0|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   111b   35.4k    3 Feb  1 15:53:36.528
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   112b   35.7k    3 Feb  1 15:53:37.525
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   112b   35.7k    3 Feb  1 15:53:38.521
    *0    *0     *0     *0       0     0|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   111b   35.4k    3 Feb  1 15:53:39.525
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.48G 83.0M 0|0 1|0   112b   35.7k    3 Feb  1 15:53:40.522
```
VI. The db.serverStatus() Command
db.serverStatus() is another way to understand the running state of a MongoDB instance, covering disk usage, memory usage, connections, journaling, available indexes, and more. The command responds quickly and has negligible impact on MongoDB's performance.
```
> db.serverStatus()
{
    "host" : "slave01",                        # server hostname
    "version" : "4.2.2",                       # MongoDB version
    "process" : "mongod",                      # process name
    "pid" : NumberLong(1777),                  # process ID
    "uptime" : 528,                            # time the server has been running, in seconds
    "uptimeMillis" : NumberLong(527409),       # uptime in milliseconds
    "uptimeEstimate" : NumberLong(527),        # uptime from MongoDB's internal coarse clock, in seconds
    "localTime" : ISODate("2020-02-01T07:25:54.031Z"),   # current server time, in UTC
    "asserts" : {                              # assertion/error counts since the server started
        "regular" : 0, "warning" : 0, "msg" : 0, "user" : 12, "rollovers" : 0
    },
    "connections" : {                          # client connection statistics
        "current" : 1, "available" : 51199, "totalCreated" : 1, "active" : 1
    },
    "electionMetrics" : { ... },               # replica-set election counters
    "extra_info" : { ... },                    # platform-dependent statistics from the underlying system
    "flowControl" : { ... },                   # flow-control status
    "freeMonitoring" : { "state" : "undecided" },
    "globalLock" : { ... },                    # database lock state
    "locks" : { ... },                         # acquisition counts for each lock type
    "logicalSessionRecordCache" : { ... },
    "network" : { ... },                       # network usage statistics
    "opLatencies" : { ... },                   # operation latency statistics
    "opReadConcernCounters" : { ... },
    "opcounters" : {                           # operations by type since startup
        "insert" : NumberLong(0), "query" : NumberLong(3), "update" : NumberLong(1),
        "delete" : NumberLong(0), "getmore" : NumberLong(0), "command" : NumberLong(23)
    },
    "opcountersRepl" : { ... },                # replicated operations by type since startup (replica sets)
    "storageEngine" : { "name" : "wiredTiger", ... },    # storage engine in use
    "tcmalloc" : { ... },                      # memory-allocator statistics
    "trafficRecording" : { "running" : false },
    "transactions" : { ... },
    "transportSecurity" : { ... },
    "twoPhaseCommitCoordinator" : { ... },
    "wiredTiger" : { ... },                    # detailed WiredTiger engine statistics
    "mem" : {                                  # memory usage of this mongod process
        "bits" : 64, "resident" : 80, "virtual" : 1513, "supported" : true
    },
    "metrics" : { ... },                       # running counters for various server activities
    "ok" : 1
}
# Sections marked "..." are abridged here; the full output runs to several hundred lines.
```
VII. The db.stats() Command
db.stats() reports the state of a single database.
```
> db.stats()
{
    "db" : "test",                      # database name
    "collections" : 5,                  # number of collections
    "views" : 0,
    "objects" : 14,                     # number of documents, summed over all collections
    "avgObjSize" : 87.14285714285714,   # average object size, i.e. dataSize/objects
    "dataSize" : 1220,                  # total data size across the database's collections
    "storageSize" : 155648,             # storage size on disk
    "numExtents" : 0,                   # total extent count across all collections
    "indexes" : 9,                      # number of indexes built
    "indexSize" : 229376,               # total index size
    "scaleFactor" : 1,
    "fsUsedSize" : 5491126272,
    "fsTotalSize" : 18238930944,
    "ok" : 1
}
```
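Sizes are reported in bytes by default; db.stats() also accepts a scale factor, which is standard usage:
```
> db.stats(1024)        // report the same statistics in kilobytes
> db.stats(1048576)     // or in megabytes
```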
Index Queries and Optimization
Where data is stored and queried at scale, indexes are a major technical factor in database performance. Used well, an index can multiply retrieval speed; used badly, it can slow the database's response and even leave the business system unable to operate normally.
1. Pay Attention to What You Index
An index works by maintaining a sort order over the indexed values ahead of time, so that matching data can be located and read quickly. How fast queries run once an index exists is closely tied to the selectivity of the indexed field, often discussed in terms of its cardinality, the number of distinct values it holds. The larger the share of the column taken up by repeats of the same value, the less effective the index.
Suppose a collection named GoodsInf holds 10,000 product records like those shown below; indexes built on different keys will perform very differently. Assume that every name value is unique, that the value 50 accounts for 30% of the price values, and that every press value is identical. The share of the whole column occupied by a single value is then:
- _id: 0.01% (one in ten thousand); no value repeats (a property of the _id key)
- name: 0.01% (one in ten thousand); no value repeats (a property of the data as entered)
- price: 30% (three thousand in ten thousand); nearly a one-third repetition rate (a property of the data as entered)
- press: 100%; all 10,000 values repeat (a property of the data as entered)
{ "_id" : ObjectId("5e353bed68e115bedd289f93"), "name" : "C语言", "price" : 32.5, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353bfe68e115bedd289f94"), "name" : "B语言", "price" : 50, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c1068e115bedd289f95"), "name" : "D语言", "price" : 40, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c1d68e115bedd289f96"), "name" : "E语言", "price" : 50, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c2f68e115bedd289f97"), "name" : "G语言", "price" : 50, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c3c68e115bedd289f98"), "name" : "T语言", "price" : 35, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c4968e115bedd289f99"), "name" : "F语言", "price" : 36.1, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c5668e115bedd289f9a"), "name" : "H语言", "price" : 29, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c6568e115bedd289f9b"), "name" : "I语言", "price" : 50, "press" : "水利水电出版社" }{ "_id" : ObjectId("5e353c7e68e115bedd289f9c"), "name" : "J语言", "price" : 50, "press" : "水利水电出版社" }...
To find data quickly, indexed _id and name are fastest, because a single value there occupies the smallest share of its column. An indexed press is slowest: to look up its value, all ten thousand records must still be scanned. An indexed price query for price=50 falls somewhere between the two.
Practical experience says that when a column's values are highly repetitive, an index adds little and can even hurt read/write performance. So do not index a key shaped like press; index a key like name instead. A field's value distribution is only a rough guide, though. To determine whether an index on a given field will actually help or harm system response, run comparative tests with explain() and decide from the results. As a rule of thumb, once a single value accounts for more than about 30% of a column, query performance drops off noticeably, so avoid making such a field an index field. A quick way to inspect a candidate field's distribution is sketched below.
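This aggregation counts documents per distinct value, so you can see how concentrated a field is before indexing it; a sketch against the example GoodsInf collection:
```
> db.GoodsInf.aggregate([
    { $group: { _id: "$press", n: { $sum: 1 } } },   // documents per distinct press value
    { $sort: { n: -1 } },                            // most frequent values first
    { $limit: 5 }                                    // only the top few matter for this check
  ])
```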
2. Comparative Testing with explain()
The best way to decide whether a field of a collection deserves an index is a comparative test with explain().
Continuing the earlier example, suppose we want to know whether querying on price is faster with an index or without one.
Step 1: with no index in place, query for price=50 and capture the first run's statistics with explain():
```
> db.GoodsInf.find({price:50}).explain("executionStats")
```
Step 2: build an index on price, query for price=50 again, and capture the second run's statistics with explain():
```
> db.GoodsInf.createIndex({price:1})
> db.GoodsInf.find({price:50}).explain("executionStats")
```
Comparing the two execution times supports a sound decision. Using explain() to compare command execution speeds is a solid, reliable optimization technique, and one that database engineers rightly value.
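In the two outputs, compare executionTimeMillis, totalDocsExamined, and the winning plan's stage. The figures below are illustrative only, and the dropIndex call is how you discard the index if it loses the comparison:
```
// without the index: "stage" : "COLLSCAN", "totalDocsExamined" : 10000 (every document scanned)
// with the index:    "stage" : "IXSCAN",   "totalDocsExamined" : 3000  (only matching keys followed)
> db.GoodsInf.dropIndex({ price: 1 })   // remove the test index if it does not pay off
```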
3. Use Indexes Sparingly
More indexes do not mean better query performance. Build only the indexes the workload actually needs, and verify each with simulated performance testing.
In production, some systems' response turns bad not because an index is missing but because an unsuitable field was indexed, dragging read/write performance down sharply.
Trouble arises when the indexed field is highly repetitive, when the data volume is past the point where indexing pays off (say, beyond ten million documents, where maintaining the index itself becomes very costly), or when a frequently modified field is indexed (at large data scale, continually rebuilding index entries becomes expensive). Create field indexes cautiously: subject to meeting operational requirements, the fewer indexes a collection carries, the better. The sketch below shows how to inventory a collection's indexes and check how much use each one gets.
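Both calls below are standard shell facilities for auditing indexes; $indexStats reports per-index access counters accumulated since the server started:
```
> db.GoodsInf.getIndexes()                         // list every index defined on the collection
> db.GoodsInf.aggregate([ { $indexStats: {} } ])   // usage counters; rarely hit indexes are drop candidates
```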
4. Return Only the Fields You Need
Suppose a collection has a dozen or so fields but the application needs only three of them. The code should then project exactly those three fields rather than lazily returning everything. Returning all fields adds network traffic, and under highly concurrent access that creates needless congestion. For example:
```
> db.GoodsInf.find({price:50})   # returns every field of the matching documents
{ "_id" : ObjectId("5e353bfe68e115bedd289f94"), "name" : "B语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c1d68e115bedd289f96"), "name" : "E语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c2f68e115bedd289f97"), "name" : "G语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c6568e115bedd289f9b"), "name" : "I语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c7e68e115bedd289f9c"), "name" : "J语言", "price" : 50, "press" : "水利水电出版社" }
> db.GoodsInf.find({price:50},{name:1,_id:0})   # explicitly project only the fields needed
{ "name" : "B语言" }
{ "name" : "E语言" }
{ "name" : "G语言" }
{ "name" : "I语言" }
{ "name" : "J语言" }
> db.GoodsInf.find({price:50},{name:1,_id:0}).limit(3)   # return at most 3 documents
{ "name" : "B语言" }
{ "name" : "E语言" }
{ "name" : "G语言" }
```
5. Use the $inc Operator for Server-Side Increments
When updating numeric values, prefer $inc over $set wherever $inc can express the change, because $inc updates faster.
```
> db.GoodsInf.updateOne({name:"C语言"},{$set: {"price": 35 }})
# Where the same modification can be written either way, prefer the $inc form below to the $set form above
> db.GoodsInf.updateOne({name:"C语言"},{$inc: {"price": 25 }})
```
6. Usage Notes for $or and $and
When a statement matches on several conditions, pay attention to how $or and $and are used, to speed up conditional retrieval. These combining operators care about the order of their conditions: a good ordering improves retrieval speed, and a poor one hurts response performance.
$or: place the condition that matches the most documents leftmost, the second most next to it, and so on; this ordering improves retrieval speed.
```
> db.GoodsInf.find({ $or: [ {price:50},{name:"C语言"}]})
{ "_id" : ObjectId("5e353bed68e115bedd289f93"), "name" : "C语言", "price" : 75, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353bfe68e115bedd289f94"), "name" : "B语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c1d68e115bedd289f96"), "name" : "E语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c2f68e115bedd289f97"), "name" : "G语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c6568e115bedd289f9b"), "name" : "I语言", "price" : 50, "press" : "水利水电出版社" }
{ "_id" : ObjectId("5e353c7e68e115bedd289f9c"), "name" : "J语言", "price" : 50, "press" : "水利水电出版社" }
```
$and: place the condition that matches the fewest documents leftmost, the second fewest next to it, and so on; this ordering improves retrieval speed.
```
> db.GoodsInf.find({ $and: [ {name:"B语言"},{"price":50} ]})
{ "_id" : ObjectId("5e353bfe68e115bedd289f94"), "name" : "B语言", "price" : 50, "press" : "水利水电出版社" }
```
$or and $and call for these orderings because of how their matching algorithms evaluate conditions: $or can accept a document as soon as one condition matches, while $and can reject one as soon as a condition fails.
MongoDB Read/Write Separation
Setting up replica-set members in MongoDB, so that the primary's data is replicated asynchronously to secondaries, greatly improves data safety and operational resilience. Under a default installation, however, client applications read from and write to the primary only; they do not operate on the secondaries directly.
As read/write pressure on the primary grows, in some scenarios it pays to hand part of the read load to the secondaries and let the primary concentrate on writes, further improving responsiveness to client applications. The primary focuses on ingesting data and replicating it to the secondaries, and the secondaries serve reads to client applications. That is true read/write separation, and it makes the server nodes cooperate effectively and be used efficiently.
In practice, the technique's requirements must be weighed so that it is applied only to suitable business scenarios.
1. Read Preference Modes Supported by MongoDB
MongoDB provides five read preference modes:
- primary: the default; all read requests go to the primary, so reads and writes both land on the primary;
- primaryPreferred: prefer the primary; if the primary is unreachable, read from a secondary. This maximizes availability, but the read-performance gain is modest, and reads served by a secondary may return inconsistent data;
- secondary: all read requests go to secondaries. This is the most thorough read/write separation and can greatly improve read responsiveness, but replication lag means reads may be inconsistent;
- secondaryPreferred: prefer secondaries; when no secondary is reachable, read from the primary. Inconsistent reads are possible;
- nearest: read requests go to the reachable member with the lowest network latency (discovered by ping), which may be the primary or a secondary. This minimizes read latency but does not guarantee consistency.
The client drivers and language APIs for MongoDB (C, C++, Java, Python, Go, Lua, and so on) all support reads under these preference modes; the shell does too, as sketched below.
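One way to apply a mode to an entire shell session is to set it on the connection object; subsequent reads on that connection honor it:
```
> db.getMongo().setReadPref("secondaryPreferred")   // connection-level read preference
> db.books.find()                                   // may now be served by a secondary
```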
2. Performing Read/Write Separation
According to MongoDB's documentation, read behavior is controlled flexibly from the application side: the read preference can be set when the database connection is established, for reads against a particular database, or for a single read command. The same read operations against specific members can also be exercised from the MongoDB shell.
Before implementing read/write separation in code or in the shell, a MongoDB replica set should already be in place, with at least one primary and two secondaries.
2.1 Read Operations in the Shell
Running the following in the shell on the primary queries the primary's copy of the collection:
```
rs0:PRIMARY> db.books.find()
{ "_id" : ObjectId("5e33e15836a42c3aa797c72b"), "name" : "C语言基础", "price" : 30 }
{ "_id" : ObjectId("5e33e3ed36a42c3aa797c72c"), "name" : "python语言基础", "price" : 45 }
```
Running the same command on a secondary fails:
```
rs0:SECONDARY> db.books.find()
Error: error: {
    "operationTime" : Timestamp(1580568822, 1),
    "ok" : 0,
    "errmsg" : "not master and slaveOk=false",
    "code" : 13435,
    "codeName" : "NotMasterNoSlaveOk",
    "$clusterTime" : {
        "clusterTime" : Timestamp(1580568822, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
```
To make the secondary's data readable, run the following on the secondary:
```
rs0:SECONDARY> db.getMongo().setSlaveOk()
rs0:SECONDARY> db.books.find()
{ "_id" : ObjectId("5e33e15836a42c3aa797c72b"), "name" : "C语言基础", "price" : 30 }
{ "_id" : ObjectId("5e33e3ed36a42c3aa797c72c"), "name" : "python语言基础", "price" : 45 }
```
Or:
```
rs0:SECONDARY> rs.slaveOk()
rs0:SECONDARY> db.books.find()
{ "_id" : ObjectId("5e33e15836a42c3aa797c72b"), "name" : "C语言基础", "price" : 30 }
{ "_id" : ObjectId("5e33e3ed36a42c3aa797c72c"), "name" : "python语言基础", "price" : 45 }
```
Alternatively, append the readPref() modifier to find() to direct the query at secondaries:
```
> db.books.find().readPref("secondary")
```
The first parameter of readPref(mode, tagset), mode, is one of primary, primaryPreferred, secondary, secondaryPreferred, or nearest.
The optional second parameter, tagset, is an ordered list of tag documents used to select specific secondaries to query.
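A sketch of tag-set usage; the region tag here is hypothetical and must correspond to tags actually assigned to members in the replica-set configuration:
```
// prefer secondaries tagged region: "east", falling back to any secondary ({} matches all)
> db.books.find().readPref("secondary", [ { "region": "east" }, { } ])
```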
2.2 Problems Introduced by Asynchronous Replication
Read/write separation achieves its goal, but data read from a secondary may be inconsistent, which follows from MongoDB's asynchronous replication model. Its characteristics must be understood precisely, so that read/write separation is applied to suitable business scenarios and kept away from unsuitable ones.
2.2.1 How Asynchronous Replication Works
Assume a MongoDB replica set with one primary A and two secondaries, B and C, has been established.
- The business system's client sends a write command, say a product review on an e-commerce platform. It is written first to primary A, which also records the operation in its own oplog.rs capped collection;
- Secondaries B and C, alerted through the heartbeat mechanism that the primary has new data, request replication from it;
- The primary's oplog entries are copied to the local oplog.rs capped collections on B and C;
- Each secondary applies the newest entries from its oplog.rs to its own database, completing one replication cycle.
The oplog (operations log) is a special capped collection (a fixed-size collection that automatically overwrites its oldest entries once it reaches its maximum size) holding a rolling record of every operation that modifies the data stored in the database.
The oplog collection is generated automatically the first time a replica-set member starts. On Unix and Windows systems the defaults are:
| Storage Engine | Default Oplog Size | Lower Bound | Upper Bound |
|---|---|---|---|
| In-Memory Storage Engine | 5% of physical memory | 50 MB | 50 GB |
| WiredTiger Storage Engine | 5% of free disk space | 990 MB | 50 GB |
To keep inserts fast, the oplog uses the capped-collection mechanism: its size is fixed, and once the cap is reached, newly inserted entries overwrite the oldest ones.
Oplog location: in a replica-set architecture, local.oplog.rs. It can be inspected directly, as sketched below.
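Two quick ways to look at the oplog from the shell; both are standard helpers:
```
rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.find().sort({ $natural: -1 }).limit(1)   // the most recent oplog entry
rs0:PRIMARY> rs.printReplicationInfo()   // oplog size plus the time window it currently covers
```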
2.2.2 The Problems
Problem 1: while primary A replicates to secondaries B and C, heartbeat timing and network conditions introduce delay, and during that delay the primary and secondaries disagree. This is inherent to distributed communication; strictly speaking, all distributed data transfer is asynchronous and carries a time lag.
Problem 2: applying data from the oplog to the secondary's local database also lags (60 seconds by default; the refresh can be made faster through configuration, at some cost to system responsiveness). If a secondary has not finished applying its local oplog before the oplog has rolled over once, the secondary can no longer keep up with the primary's replication, and replication stops (it can be forcibly restarted with db.runCommand({"resync": 1})). That produces serious data inconsistency. Some precautions can be taken against it.
When first creating the replica set, give the oplog an appropriately sized allocation:
```
# Specified in the configuration file
replication:
   oplogSizeMB: <int>                  # choose an appropriate size
   replSetName: <string>
   secondaryIndexPrefetch: <string>
   enableMajorityReadConcern: <boolean>
```
The oplog can also be resized while MongoDB is running in production.
First check the current oplog size, in bytes:
```
> use local
> db.oplog.rs.stats().maxSize
```
Then resize it, effective without a restart; the unit is MB:
```
> db.adminCommand({ replSetResizeOplog: 1, size: 16384 })
```
Problem 3: when the primary fails, the secondary whose oplog replication is most current is elected the new primary. If part of the old primary's oplog had not yet been copied to any secondary, then when the old primary recovers it rolls back its own oplog data to agree with the new primary's. The replica set performs this switchover automatically, so that portion of the data is lost. The main precautions:
When writing from the client, use a write concern that requires the write to reach a majority of members, which prevents the rollback problem. Constrain it directly on the write command:
```
> db.products.insert(
    { item: "envelopes", qty: 100, type: "Clasp" },
    { writeConcern: { w: "majority", wtimeout: 5000 } }
  )
```
Alternatively, set the default in the replica-set configuration on the primary, so that every insert must either replicate to a majority of members or report failure, preserving consistency:
```
> rsconfig = rs.conf()
> rsconfig.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
> rs.reconfig(rsconfig)
```
The old primary writes the rolled-back data into a separate rollback directory, from which a database administrator can restore it with mongorestore as needed.
Problem 4: adding a secondary in production drags down the primary's responsiveness sharply. When a brand-new secondary joins the replica set, its database is empty, and it must copy the primary's data to become consistent with it. That bulk copy generates heavy network traffic and adds to the primary's read/write load; in bad cases it degrades the business system's access to the primary, and the copy can run for hours or even days. To avoid this, before attaching the new member to a production replica set:
- Method 1: manually copy the primary's data onto the new secondary (while it is still standalone), then attach it to the replica set;
- Method 2: use a quiet window, such as overnight, to attach the new secondary and let its initial sync mechanism (Initial Sync) bring the data across automatically.
2.3 Scenarios for Read/Write Separation
Overall, read/write separation suits only scenarios that do not demand strict data consistency, that is, business scenarios that can tolerate a small amount of inconsistent data.
2.3.1 Data Insensitive to Consistency
A large e-commerce platform, for instance, carries vast numbers of product reviews. If reviews for a product are read from a secondary and a few reads are slightly stale, the customers browsing them can tolerate it: they cannot perceive the inconsistency, and seeing 19 reviews for a product instead of 20 makes little difference to them. What matters more to customers is getting the reviews quickly.
2.3.2 Back-Office Statistics and Analytics
A large e-commerce platform periodically analyzes product-view rankings, data of interest to the platform operator and the merchants on it. For such data, neither cares much whether a product was clicked 50 times or 49; they care about the trend in total clicks. If a product's click trend is rising while its stock is falling fast, decision-makers have extracted real value from the data: they can restock early to ship faster, and even consider raising the retail price, since demand is outstripping supply. Reading and analyzing this kind of data from secondaries is a sound design.
2.3.3 Infrequent Writes Under Heavy Read Load
On a typical e-commerce platform, the basic information for common products rarely changes once written; it stays relatively fixed. Yet it must withstand hundreds of thousands, millions, or even tens of millions of visits per day. Under that pressure, read/write separation is a sound choice: it relieves the primary substantially, and the chance of inconsistency is very small (the data already sits on the secondaries, and replication from the primary for it is infrequent).
