The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Emerging byte-addressable non-volatile memory devices are attracting much attention. A non-volatile main memory (NVMM) built on them enables a larger memory size and lower power consumption than a traditional DRAM main memory. To fully utilize an NVMM, software and hardware must be cooperatively optimized. At the same time, even at the level of the memory module, the microarchitecture is still under development, although real non-volatile memory modules such as Intel Optane DC persistent memory (DCPMM) are already on the market. Among existing NVMM evaluation environments, software simulators can evaluate various microarchitectures but require long simulation times. Emulators can evaluate a whole system quickly but are less flexible in configuration than simulators. Thus, an NVMM emulator that offers both flexible and fast system evaluation still has an important role in exploring the optimal system. In this paper, we introduce an NVMM emulator for embedded systems and use it to explore directions for NVMM optimization techniques. It is implemented on an SoC-FPGA board and employs three NVMM behaviour models: coarse-grain, fine-grain and DCPMM-based. The coarse-grain and fine-grain models enable NVMM performance evaluations based on extensions of traditional DRAM behaviour. The DCPMM-based model emulates the behaviour of a real DCPMM. A whole evaluation environment is also provided, including Linux kernel modifications and several runtime functions. We first validate the developed emulator against an existing NVMM emulator, a cycle-accurate NVMM simulator and a real DCPMM. Then, the differences in program behaviour among the three models are evaluated with SPEC CPU programs. As a result, the fine-grain model reveals that program execution time is affected by the frequency of NVMM memory requests rather than by the cache hit ratio.
Comparing the fine-grain model with the coarse-grain model, under the condition that the former has a longer total write latency than the latter, the former shows lower execution time for four of fourteen programs because of the bank-level parallelism and row-buffer access locality that it exploits.
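The last observation can be illustrated with a toy latency model. The sketch below is not the authors' emulator: the class names, the trace format `(bank, row, is_write)`, the one-request-per-`issue_ns` issue rule and every latency value are assumptions chosen only for illustration. It contrasts a coarse-grain model, where each request pays a fixed latency and requests are serialized, with a fine-grain model that tracks one open row per bank and lets different banks work in parallel.

```python
from dataclasses import dataclass


@dataclass
class CoarseModel:
    """Hypothetical coarse-grain model: fixed latency per request, no overlap."""
    read_ns: int
    write_ns: int

    def run(self, trace):
        # Requests are serialized, so the total time is the sum of latencies.
        return sum(self.write_ns if is_write else self.read_ns
                   for _bank, _row, is_write in trace)


@dataclass
class FineModel:
    """Hypothetical fine-grain model: per-bank row buffer and per-bank timing."""
    hit_ns: int         # latency when the requested row is already open
    read_miss_ns: int   # latency of a read that must open a new row
    write_miss_ns: int  # latency of a write that must open a new row
    issue_ns: int = 1   # one request is issued every issue_ns (assumption)

    def run(self, trace):
        open_row = {}   # bank -> currently open row
        bank_free = {}  # bank -> time at which the bank becomes idle
        now = 0         # issue time of the current request
        finish = 0      # completion time of the latest-finishing request
        for bank, row, is_write in trace:
            # A request starts once it is issued and its bank is idle.
            start = max(now, bank_free.get(bank, 0))
            if open_row.get(bank) == row:
                latency = self.hit_ns
            else:
                latency = self.write_miss_ns if is_write else self.read_miss_ns
                open_row[bank] = row
            bank_free[bank] = start + latency
            finish = max(finish, bank_free[bank])
            now += self.issue_ns
        return finish


# Illustrative parameters: the fine model's write miss is slower than the
# coarse model's fixed write latency.
coarse = CoarseModel(read_ns=100, write_ns=300)
fine = FineModel(hit_ns=50, read_miss_ns=150, write_miss_ns=400, issue_ns=10)

# Eight writes spread over four banks, revisiting the same row in each bank.
spread = [(bank, 0, True) for bank in range(4)] * 2
# Eight writes to a single bank, each to a different row.
single_bank = [(0, row, True) for row in range(8)]

for name, trace in (("spread", spread), ("single_bank", single_bank)):
    print(name, "coarse:", coarse.run(trace), "fine:", fine.run(trace))
```

With these made-up numbers, the fine-grain model finishes the bank-spread trace sooner than the coarse-grain one even though its write miss latency is longer, while the single-bank trace with no row reuse takes longer. This mirrors, in a very simplified way, why only some programs benefit in the evaluation described above.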
Yu OMORI
Waseda University
Keiji KIMURA
Waseda University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yu OMORI, Keiji KIMURA, "Non-Volatile Main Memory Emulator for Embedded Systems Employing Three NVMM Behaviour Models" in IEICE TRANSACTIONS on Information, vol. E104-D, no. 5, pp. 697-708, May 2021, doi: 10.1587/transinf.2020EDP7092.
Abstract: Emerging byte-addressable non-volatile memory devices attract much attention. A non-volatile main memory (NVMM) built on them enables larger memory size and lower power consumption than a traditional DRAM main memory. To fully utilize an NVMM, both software and hardware must be cooperatively optimized. Simultaneously, even focusing on a memory module, its micro architecture is still being developed though real non-volatile memory modules, such as Intel Optane DC persistent memory (DCPMM), have been on the market. Looking at existing NVMM evaluation environments, software simulators can evaluate various micro architectures with their long simulation time. Emulators can evaluate the whole system fast with less flexibility in their configuration than simulators. Thus, an NVMM emulator that can realize flexible and fast system evaluation still has an important role to explore the optimal system. In this paper, we introduce an NVMM emulator for embedded systems and explore a direction of optimization techniques for NVMMs by using it. It is implemented on an SoC-FPGA board employing three NVMM behaviour models: coarse-grain, fine-grain and DCPMM-based. The coarse and fine models enable NVMM performance evaluations based on extensions of traditional DRAM behaviour. The DCPMM-based model emulates the behaviour of a real DCPMM. Whole evaluation environment is also provided including Linux kernel modifications and several runtime functions. We first validate the developed emulator with an existing NVMM emulator, a cycle-accurate NVMM simulator and a real DCPMM. Then, the program behavior differences among three models are evaluated with SPEC CPU programs. As a result, the fine-grain model reveals the program execution time is affected by the frequency of NVMM memory requests rather than the cache hit ratio. 
Comparing with the fine-grain model and the coarse-grain model under the condition of the former's longer total write latency than the latter's, the former shows lower execution time for four of fourteen programs than the latter because of the bank-level parallelism and the row-buffer access locality exploited by the former model.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDP7092/_p
@ARTICLE{e104-d_5_697,
author={Yu OMORI and Keiji KIMURA},
journal={IEICE TRANSACTIONS on Information},
title={Non-Volatile Main Memory Emulator for Embedded Systems Employing Three NVMM Behaviour Models},
year={2021},
volume={E104-D},
number={5},
pages={697-708},
abstract={Emerging byte-addressable non-volatile memory devices attract much attention. A non-volatile main memory (NVMM) built on them enables larger memory size and lower power consumption than a traditional DRAM main memory. To fully utilize an NVMM, both software and hardware must be cooperatively optimized. Simultaneously, even focusing on a memory module, its micro architecture is still being developed though real non-volatile memory modules, such as Intel Optane DC persistent memory (DCPMM), have been on the market. Looking at existing NVMM evaluation environments, software simulators can evaluate various micro architectures with their long simulation time. Emulators can evaluate the whole system fast with less flexibility in their configuration than simulators. Thus, an NVMM emulator that can realize flexible and fast system evaluation still has an important role to explore the optimal system. In this paper, we introduce an NVMM emulator for embedded systems and explore a direction of optimization techniques for NVMMs by using it. It is implemented on an SoC-FPGA board employing three NVMM behaviour models: coarse-grain, fine-grain and DCPMM-based. The coarse and fine models enable NVMM performance evaluations based on extensions of traditional DRAM behaviour. The DCPMM-based model emulates the behaviour of a real DCPMM. Whole evaluation environment is also provided including Linux kernel modifications and several runtime functions. We first validate the developed emulator with an existing NVMM emulator, a cycle-accurate NVMM simulator and a real DCPMM. Then, the program behavior differences among three models are evaluated with SPEC CPU programs. As a result, the fine-grain model reveals the program execution time is affected by the frequency of NVMM memory requests rather than the cache hit ratio. 
Comparing with the fine-grain model and the coarse-grain model under the condition of the former's longer total write latency than the latter's, the former shows lower execution time for four of fourteen programs than the latter because of the bank-level parallelism and the row-buffer access locality exploited by the former model.},
keywords={},
doi={10.1587/transinf.2020EDP7092},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Non-Volatile Main Memory Emulator for Embedded Systems Employing Three NVMM Behaviour Models
T2 - IEICE TRANSACTIONS on Information
SP - 697
EP - 708
AU - Yu OMORI
AU - Keiji KIMURA
PY - 2021
DO - 10.1587/transinf.2020EDP7092
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2021
AB - Emerging byte-addressable non-volatile memory devices attract much attention. A non-volatile main memory (NVMM) built on them enables larger memory size and lower power consumption than a traditional DRAM main memory. To fully utilize an NVMM, both software and hardware must be cooperatively optimized. Simultaneously, even focusing on a memory module, its micro architecture is still being developed though real non-volatile memory modules, such as Intel Optane DC persistent memory (DCPMM), have been on the market. Looking at existing NVMM evaluation environments, software simulators can evaluate various micro architectures with their long simulation time. Emulators can evaluate the whole system fast with less flexibility in their configuration than simulators. Thus, an NVMM emulator that can realize flexible and fast system evaluation still has an important role to explore the optimal system. In this paper, we introduce an NVMM emulator for embedded systems and explore a direction of optimization techniques for NVMMs by using it. It is implemented on an SoC-FPGA board employing three NVMM behaviour models: coarse-grain, fine-grain and DCPMM-based. The coarse and fine models enable NVMM performance evaluations based on extensions of traditional DRAM behaviour. The DCPMM-based model emulates the behaviour of a real DCPMM. Whole evaluation environment is also provided including Linux kernel modifications and several runtime functions. We first validate the developed emulator with an existing NVMM emulator, a cycle-accurate NVMM simulator and a real DCPMM. Then, the program behavior differences among three models are evaluated with SPEC CPU programs. As a result, the fine-grain model reveals the program execution time is affected by the frequency of NVMM memory requests rather than the cache hit ratio. 
Comparing with the fine-grain model and the coarse-grain model under the condition of the former's longer total write latency than the latter's, the former shows lower execution time for four of fourteen programs than the latter because of the bank-level parallelism and the row-buffer access locality exploited by the former model.
ER -