Using an NVMe disk

Source: Imre kasutab arvutit

Introduction

Misc

  • When choosing a physical device, it may be wise to consider how the manufacturer distributes firmware updates; Samsung, for example, appears to handle this sensibly

Working principle

TODO

  • Operates on the PCIe bus, i.e. the device is visible in lspci output

Misc

To install the nvme utility, run

# apt-get install nvme-cli

and to use it, e.g.

# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     50026B728245640E     KINGSTON SKC2000M81000G                  1           1.00  TB /   1.00  TB    512   B +  0 B   S2780101
/dev/nvme1n1     S466NB0K423925X      Samsung SSD 970 EVO 500GB                1         346.22  GB / 500.11  GB    512   B +  0 B   1B2QEXE7
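
Since firmware updates were mentioned above as a selection criterion, a small helper can pull the device node and firmware revision out of this listing. This is a minimal sketch: the function name nvme_fw_revisions is hypothetical, and it assumes the default table layout of nvme list shown above, with the node in the first column and the firmware revision in the last.

```shell
# Print "<node> <firmware revision>" for every device row of `nvme list`
# output read on stdin. Header and separator lines are skipped because
# their first column does not start with /dev/.
nvme_fw_revisions() {
    awk '$1 ~ /^\/dev\// { print $1, $NF }'
}

# Usage on a live system (needs root and nvme-cli):
#   nvme list | nvme_fw_revisions
```

For the listing above this would print one line per device, e.g. the node /dev/nvme1n1 followed by 1B2QEXE7.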

Kingston KC2000

smartctl data

# smartctl -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.3.13-3-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KINGSTON SKC2000M81000G
Serial Number:                      50026B728245640E
Firmware Version:                   S2780101
PCI Vendor/Subsystem ID:            0x2646
IEEE OUI Identifier:                0x000000
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            0
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            0026b7 28245640e5
Local Time is:                      Sat Feb  8 23:36:52 2020 EET
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     75 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        0       0
 1 +     4.60W       -        -    1  1  1  1        0       0
 2 +     3.80W       -        -    2  2  2  2        0       0
 3 -   0.0450W       -        -    3  3  3  3     2000    2000
 4 -   0.0040W       -        -    4  4  4  4     6000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        31 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    41,160 [21.0 GB]
Data Units Written:                 0
Host Read Commands:                 117,206
Host Write Commands:                0
Controller Busy Time:               0
Power Cycles:                       3
Power On Hours:                     4
Unsafe Shutdowns:                   1
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 256 entries)
No Errors Logged
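
The Data Units counters in the SMART section are not bytes: the NVMe SMART log counts in units of 1000 blocks of 512 bytes, i.e. 512000 bytes per unit. The sketch below converts such a counter to bytes; the function name data_units_to_bytes is a hypothetical helper, not part of smartctl.

```shell
# Convert an NVMe "Data Units" counter to bytes.
# One data unit is 1000 * 512 bytes = 512000 bytes.
# Thousands separators as printed by smartctl ("41,160") are accepted.
data_units_to_bytes() {
    units=$(printf '%s' "$1" | tr -d ',')
    echo $((units * 512000))
}

data_units_to_bytes 41,160   # prints 21073920000
```

That is roughly the 21 GB that smartctl shows in brackets next to Data Units Read.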

Samsung SSD 970 EVO 500GB

Performance

# dd if=/dev/nvme0n1 of=/dev/null bs=1M
476940+1 records in
476940+1 records out
500107862016 bytes (500 GB, 466 GiB) copied, 238.902 s, 2.1 GB/s
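
The rate in the last line of the dd output can be recomputed from the byte count and the elapsed time. A minimal sketch, with dd_throughput_gbs as a hypothetical helper name, using decimal gigabytes as dd does:

```shell
# Throughput in GB/s (10^9 bytes per second) from a byte count and seconds.
# LC_ALL=C keeps the decimal point independent of the system locale.
dd_throughput_gbs() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes / secs / 1e9 }'
}

dd_throughput_gbs 500107862016 238.902   # prints 2.09
```

dd rounds the same figure to 2.1 GB/s.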

In lspci output

# lspci | grep -i nvme
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981

In the SCSI device list

# lsscsi 
[2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q     /dev/sda 
[4:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q     /dev/sdb 
[9:0:0:0]    disk    LIO-ORG  iscsi_block_1    4.0      /dev/sdc 
[N:0:4:1]    disk    Samsung SSD 970 EVO 500GB__1       /dev/nvme0n1

Disk SMART info

# smartctl -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.3.13-3-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO 500GB
Serial Number:                      S466NB0K423925X
Firmware Version:                   1B2QEXE7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 500,107,862,016 [500 GB]
Unallocated NVM Capacity:           0
Controller ID:                      4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          500,107,862,016 [500 GB]
Namespace 1 Utilization:            354,080,468,992 [354 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5481b3e78b
Local Time is:                      Sat Feb  8 18:03:58 2020 EET
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.20W       -        -    0  0  0  0        0       0
 1 +     4.30W       -        -    1  1  1  1        0       0
 2 +     2.10W       -        -    2  2  2  2        0       0
 3 -   0.0400W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        35 Celsius
Available Spare:                    98%
Available Spare Threshold:          10%
Percentage Used:                    2%
Data Units Read:                    186,789,370 [95.6 TB]
Data Units Written:                 34,224,216 [17.5 TB]
Host Read Commands:                 3,141,407,873
Host Write Commands:                1,500,589,648
Controller Busy Time:               7,275
Power Cycles:                       78
Power On Hours:                     8,920
Unsafe Shutdowns:                   60
Media and Data Integrity Errors:    12
Error Information Log Entries:      13
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               35 Celsius
Temperature Sensor 2:               37 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         13     7  0x0031  0x4502  0x000     98996012     1     -
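
The overall health result above is PASSED even though the Media and Data Integrity Errors counter is non-zero, so it may be worth checking that counter separately. The following is a sketch only; media_errors and media_ok are hypothetical helper names, and the parsing assumes the smartctl -a text layout shown above.

```shell
# Print the "Media and Data Integrity Errors" counter from `smartctl -a`
# output read on stdin, with spaces and thousands separators removed.
media_errors() {
    awk -F: '/^Media and Data Integrity Errors/ { gsub(/[ \t,]/, "", $2); print $2 }'
}

# Exit status 0 if the counter is exactly zero, non-zero otherwise.
media_ok() {
    [ "$(media_errors)" = "0" ]
}

# Usage on a live system (needs root and smartmontools):
#   smartctl -a /dev/nvme0n1 | media_ok || echo "media errors reported"
```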

Errors in dmesg output

[ 1121.232279] blk_update_request: critical medium error, dev nvme0n1, sector 98995200 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
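
To get an overview of which sectors are affected, the device name and sector number can be extracted from such kernel messages. A minimal sketch, assuming the message wording shown above; medium_error_sectors is a hypothetical helper name.

```shell
# Print unique "<device> <sector>" pairs for every "critical medium error"
# line found on stdin; all other lines are ignored.
medium_error_sectors() {
    sed -n 's/.*critical medium error, dev \([^,]*\), sector \([0-9]*\).*/\1 \2/p' | sort -u
}

# Usage on a live system:
#   dmesg | medium_error_sectors
```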

Replacing a failing storage device with a working one

The starting point is the following situation

  • the computer has room for two physical NVMe devices
  • the computer contains one failing physical NVMe device
# dmesg
..
[Sun Nov 21 19:40:34 2021] blk_update_request: critical medium error, dev nvme0n1, sector 1823429504 op 0x0:(READ) flags 0x0 phys_seg 45 prio class 0
[Sun Nov 21 19:40:34 2021] blk_update_request: critical medium error, dev nvme0n1, sector 1823429632 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
...
  • the computer contains one working physical NVMe device
  • the failing NVMe device is in use as the only leg of an mdadm RAID mirror
  • one more working NVMe device is available in addition

Replacement procedure

  • partition the working device so that one partition is the same size as the mdadm partition on the failing device
  • join the two into a mirror
  • wait until mdadm finishes synchronizing (it turns out that the mdadm sync carries on past the errors)
  • remove the failing device from the mirror
  • add the second working NVMe device to the mirror
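
The steps above can be sketched as a sequence of mdadm commands. The helper below only prints the commands and runs nothing; print_replace_plan is a hypothetical name, and every device name passed to it is a placeholder that has to be replaced with the real array and partitions. Partitioning the new device is not shown.

```shell
# Print (do not execute) the mdadm commands for replacing a failing member.
#   $1 = md array            $2 = failing member partition
#   $3 = partition on the first working device
#   $4 = partition on the second working device
print_replace_plan() {
    md=$1 bad=$2 new1=$3 new2=$4
    cat <<EOF
mdadm --manage $md --add $new1
# only needed if the array was created with a single device:
mdadm --grow $md --raid-devices=2
mdadm --wait $md
mdadm --manage $md --fail $bad
mdadm --manage $md --remove $bad
mdadm --manage $md --add $new2
mdadm --wait $md
cat /proc/mdstat
EOF
}

# Placeholder device names, for illustration only.
print_replace_plan /dev/md0 /dev/nvme0n1p3 /dev/nvme1n1p3 /dev/nvme2n1p3
```

Printing the plan first makes it possible to review the device names before anything is changed; with only two slots, the second working device presumably goes into the slot freed by the failing one, so its name has to be checked after the swap.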

The result is a working NVMe-based mdadm mirror in the computer.

This procedure has an interesting side effect: when the locations that correspond to the medium errors are read from the working disks, they look like this (in this case the resource belongs to a Proxmox virtual machine)

root@pm60-trt:~# dd if=/dev/pve_trt2/vm-169-disk-0 of=/dev/null bs=1M
dd: error reading '/dev/pve_trt2/vm-169-disk-0': Input/output error
2246+1 records in
2246+1 records out
2355425280 bytes (2.4 GB, 2.2 GiB) copied, 2.1705 s, 1.1 GB/s

and the dmesg entry

[Mon Nov 22 20:20:10 2021] Buffer I/O error on dev dm-15, logical block 575055, async page read
[Mon Nov 22 20:20:10 2021] Buffer I/O error on dev dm-15, logical block 575055, async page read
[Mon Nov 22 20:20:10 2021] Buffer I/O error on dev dm-15, logical block 575055, async page read
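
The logical block number in these entries lines up with the point where dd stopped: assuming a logical block size of 4096 bytes (an assumption, not stated in the log), block 575055 starts exactly at the 2355425280 bytes that dd reported as copied. A small sketch of the arithmetic, with block_to_offset as a hypothetical helper name:

```shell
# Byte offset of a logical block; the block size defaults to 4096 bytes,
# which is an assumption that happens to match the dd output above.
block_to_offset() {
    echo $(( $1 * ${2:-4096} ))
}

block_to_offset 575055   # prints 2355425280
```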

Note that the errors go away once the affected blocks are overwritten; the example here zeroes the whole volume, which destroys its contents, e.g.

root@pm60-trt:~# dd of=/dev/pve_trt2/vm-169-disk-0 if=/dev/zero bs=1M
dd: error writing '/dev/pve_trt2/vm-169-disk-0': No space left on device
65537+0 records in
65536+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 67.1265 s, 1.0 GB/s

root@pm60-trt:~# dd if=/dev/pve_trt2/vm-169-disk-0 of=/dev/null bs=1M
65536+0 records in
65536+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 66.5542 s, 1.0 GB/s
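
As an alternative that is not part of the procedure above, it may be enough to overwrite only the block named in the dmesg entry instead of the whole volume. The sketch below only prints a dd command and runs nothing; print_zero_block_cmd is a hypothetical helper, the 4096-byte block size is an assumption, and whatever data was in that block is lost when the printed command is run.

```shell
# Print (do not execute) a dd command that zeroes a single block.
#   $1 = device or volume, $2 = block number, $3 = block size (default 4096)
print_zero_block_cmd() {
    dev=$1 block=$2 bs=${3:-4096}
    echo "dd if=/dev/zero of=$dev bs=$bs seek=$block count=1 oflag=direct"
}

print_zero_block_cmd /dev/pve_trt2/vm-169-disk-0 575055
```

If more than one block is affected, the read test has to be repeated after each overwrite to find the next failing block.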

If data needs to be recovered from the virtual machine in question, the machine can be booted from e.g. a live Linux CD, the file systems mounted, and whatever is readable copied off; rsync, for example, appears to carry on past the errors as well.

Useful additional materials