
Tag Archives: Kernel

ZRAM vs ZSWAP vs ZCACHE


Another interesting thing to “oprex”:

Summary of what to use when:

  1. ZRAM if you have no HDD/SSD swap partition.
  2. ZSWAP if you do have an HDD/SSD swap partition.
  3. ZCACHE does what ZSWAP does and also compresses the filesystem page cache, speeding it up. (It is internally much more complicated and is not in the mainline kernel, as it is still under development.)

Summary of their implementations:

  1. ZRAM is a compressed RAM-based swap device.
  2. ZSWAP is a compressed cache in front of an existing swap device.
  3. ZCache is a backend for a special kind of virtual RAM (Transcendent memory) that can be used to cache filesystem pages or swap data.

Details:

  • ZRAM: Creates a swap device in RAM. Pages sent to it are compressed as they are stored. It is given a higher priority than other swap devices: pages that are swapped out are sent preferentially to the zram device until it is full, and only then are any other swap devices used.
    • Benefits: Independent of other (physical) swap devices. It can be used when there is no swap partition to expand the available memory.
    • Disadvantages: If other swap devices (HDD/SSD) are present, they are not used optimally. As the zram device is an independent swap device, once it is full, any new pages that need to be swapped out are sent directly to the next swap device, hence:
      1. There is a real chance of LRU (least recently used) inversion: the most recently swapped data goes to the slow disk, while inactive pages that were swapped out long ago remain in the fast ZRAM.
      2. The data sent to and read from the disk consumes a lot of bandwidth, as it is uncompressed.
    • Status: Merged into the mainline kernel in 3.14. Once enabled on a system, it requires some userspace configuration to set up the swap devices and use them.
  • ZSWAP: The frontswap system hooks attempts to swap out pages and uses zswap as a write-back cache for an HDD/SSD swap device. An attempt is made to compress the page; if it contains poorly compressible data, it is written directly to the disk. If the data compresses well, it is stored in the zswap memory pool. When the total size of compressed pages in RAM exceeds a certain limit, the least recently used (LRU) compressed page is written to the disk, as it is unlikely to be needed soon.
    • Benefits: Very efficient use of RAM and disk-based swap. Minimizes disk I/O by reducing the number of reads and writes required, since data is compressed and held in RAM. Pages that are eventually written back are decompressed first, so the saving comes from avoided I/O rather than from smaller writes.
    • Limitations: It is an enhancement of disk based swap systems and hence depends on a swap partition on the hard disk.
    • Status: Merged into the 3.11 mainline Linux kernel.
  • ZCache: A backend for the Transcendent memory system. Transcendent memory provides RAM-like memory that can only be accessed a page at a time using put and get calls, unlike normal memory, which can be accessed a byte at a time. The frontswap and cleancache systems hook attempts to swap pages and to reclaim filesystem page cache, respectively, and send the pages to the transcendent memory backend. When zcache is used as the backend, the data is compressed and stored in RAM. When it fills up, compressed pages are evicted to the swap device. (An alternative backend is RAMster, which shares a pool of RAM across networked computers.) Using only the frontswap frontend with the zcache backend works just like zswap. (In fact, zswap is a simplified subset of zcache.)
    • Benefits: Provides compressed caching both for swap and for filesystem caches.
    • Status: Still not mainlined as it is very complicated and is being worked on.
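
The LRU inversion point in the details above is easier to see with a toy model. The following is a minimal Python sketch, not the kernel's actual logic: the function names `simulate_zram` and `simulate_zswap`, the page labels, and the capacity of 3 pages are all assumptions made up for illustration.

```python
from collections import OrderedDict


def simulate_zram(pages, capacity):
    """Toy model of zram as a fixed-size swap device in front of a disk.

    Pages fill the zram device first; once it is full, every later page
    goes straight to disk. Nothing is ever moved from zram to disk.
    """
    zram, disk = [], []
    for page in pages:
        if len(zram) < capacity:
            zram.append(page)
        else:
            disk.append(page)
    return zram, disk


def simulate_zswap(pages, capacity):
    """Toy model of zswap as an LRU write-back cache in front of a disk.

    New pages always enter the compressed pool; when the pool is over
    capacity, the page that was stored longest ago is written to disk.
    """
    pool, disk = OrderedDict(), []
    for page in pages:
        pool[page] = True
        if len(pool) > capacity:
            oldest, _ = pool.popitem(last=False)
            disk.append(oldest)
    return list(pool), disk


if __name__ == "__main__":
    swapped_out = ["p1", "p2", "p3", "p4", "p5"]  # oldest first
    print("zram :", simulate_zram(swapped_out, capacity=3))
    print("zswap:", simulate_zswap(swapped_out, capacity=3))
```

With five pages and room for three, the zram model keeps the three oldest pages in RAM and sends the two newest to disk, while the zswap model keeps the three newest in RAM and writes the two oldest back.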

The best resources I found were:

Linux Tuning: Making Use of tmpfs


After enabling zram on the Linux host of the notebook I use for daily work, and using btrfs as the file system for my data, this time I am trying another round of performance tuning. This time I am making use of tmpfs, a virtual file system that the Linux kernel supports by default.

As we know, directories such as /tmp and /var/tmp/ are temporary directories that a Linux system uses to store temporary files and directories. When Linux is shut down, these temporary directories are cleared, contents included. Starting from that understanding, it makes sense to place these directories and their contents in RAM, since RAM access is faster than disk access.

Disclaimer: I only recommend this for notebooks or desktops. It is not recommended for servers, because this tuning does not yet include a script to flush (write) the contents of those directories to disk.
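
Before changing anything, it can help to see which directories are already backed by tmpfs. The snippet below is only a sketch under assumptions: the helper name `tmpfs_mount_points` and the sample text are made up, and on a real system you would pass in the contents of /proc/mounts instead.

```python
def tmpfs_mount_points(mounts_text):
    """Return mount points whose filesystem type is tmpfs.

    Expects text in the /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    # Hypothetical sample; a real check would read /proc/mounts.
    sample = (
        "/dev/sda1 / ext3 rw,errors=remount-ro 0 0\n"
        "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0\n"
    )
    mounted = tmpfs_mount_points(sample)
    for directory in ("/tmp", "/var/tmp"):
        state = "tmpfs" if directory in mounted else "not tmpfs"
        print(directory, "->", state)
```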


btrfs-cleaner leaves CPU pinned to 100%


For the past three days I have had a problem. The notebook would suddenly become slow in the middle of work. When I checked nmon, there was a process eating almost 100% of the CPU: btrfs-cleaner.

After searching the internet, I found that this btrfs-cleaner behavior often occurs when autodefrag is enabled on a btrfs file system (which I had enabled... at boot time :D).

The /etc/fstab settings:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda1 during installation
UUID=20a3dd42-fbf7-486b-8db2-0f77c1c58b5c /               ext3    errors=remount-ro 0       1
# /home was on /dev/sda6 during installation
UUID=3d9724c3-3157-43dd-b7de-db73b4212f6e /home           ext3    defaults        0       2
# swap was on /dev/sda5 during installation
#UUID=d633756f-cbbf-4963-b09a-c1d599743f4f none            swap    sw              0       0
UUID=9858bb28-275f-458f-83f8-869b22b33aa8 /backup         btrfs  noatime,nodiratime,autodefrag,noacl,compress-force=lzo         0       0
UUID=03d8854b-a9bb-4dc9-b994-e993aace5ae0 /data           btrfs  noatime,nodiratime,autodefrag,noacl,compress-force=zlib         0       0

The solution is to disable autodefrag in /etc/fstab. The autodefrag option is not recommended for a file system with high I/O 😦

The /etc/fstab file after disabling autodefrag:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda1 during installation
UUID=20a3dd42-fbf7-486b-8db2-0f77c1c58b5c /               ext3    errors=remount-ro 0       1
# /home was on /dev/sda6 during installation
UUID=3d9724c3-3157-43dd-b7de-db73b4212f6e /home           ext3    defaults        0       2
# swap was on /dev/sda5 during installation
#UUID=d633756f-cbbf-4963-b09a-c1d599743f4f none            swap    sw              0       0
UUID=9858bb28-275f-458f-83f8-869b22b33aa8 /backup         btrfs  noatime,nodiratime,noacl,compress-force=lzo         0       0
UUID=03d8854b-a9bb-4dc9-b994-e993aace5ae0 /data           btrfs  noatime,nodiratime,noacl,compress-force=zlib         0       0
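
The edit above was done by hand, but the same change can be sketched in a few lines of Python. This is only an illustration: the helper name `drop_mount_option` is made up, and it works on one fstab entry passed in as a string rather than touching /etc/fstab itself.

```python
def drop_mount_option(fstab_line, option):
    """Remove one mount option from a single fstab entry.

    Comment lines, blank lines, lines with fewer than four fields, and
    lines that do not contain the option are returned unchanged. In a
    rewritten line, whitespace between fields collapses to one space.
    """
    stripped = fstab_line.strip()
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    if len(fields) < 4:
        return fstab_line
    options = fields[3].split(",")
    if option not in options:
        return fstab_line
    kept = [opt for opt in options if opt != option]
    # An empty options field is not valid, so fall back to "defaults".
    fields[3] = ",".join(kept) or "defaults"
    return " ".join(fields)


if __name__ == "__main__":
    entry = ("UUID=03d8854b-a9bb-4dc9-b994-e993aace5ae0 /data btrfs "
             "noatime,nodiratime,autodefrag,noacl,compress-force=zlib 0 0")
    print(drop_mount_option(entry, "autodefrag"))
```

Working on strings keeps the sketch safe to run anywhere; writing the result back to /etc/fstab is deliberately left out.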

 

Difference between ZRAM and ZSWAP


First answer :

ZRAM is a module of the Linux kernel, previously called “compcache”. ZRAM increases performance by avoiding paging to disk: paging instead takes place in a compressed block device in RAM until it becomes necessary to use the swap space on the hard disk drive. Since using RAM is faster than using disks, zram allows Linux to make more use of RAM when swapping/paging is required, especially on older computers with less RAM installed.

ZSWAP is a lightweight compressed cache for swap pages. It takes pages that are in the process of being swapped out and attempts to compress them into a dynamically allocated RAM-based memory pool. zswap basically trades CPU cycles for potentially reduced swap I/O. This trade-off can also result in a significant performance improvement if reads from the compressed cache are faster than reads from a swap device.

Second answer :

zram

  • Status: In staging tree (as of 3.7) and looking to move into mainline
  • Implementation: compressed block device, memory is dynamically allocated as data is stored
  • Usage: Configure a zram block device as a swap device to eliminate the need for a physical swap device or swap file
  • Benefits:
    1. Eliminates the need for a physical swap device. This became popular when netbooks first showed up. Zram (then compcache) allowed users to avoid swap shortening the lifespan of SSDs in these memory-constrained systems.
    2. A zram block device can be used for applications other than swap: conceivably anything you might use a block device for.
  • Drawbacks:
    1. Once a page is stored in zram it will remain there until paged in or invalidated. The first pages to be paged out will be the oldest pages (LRU list); these are ‘cold’ pages that are infrequently accessed. As the system continues to swap it will move on to pages that are warmer (more frequently accessed), and these may not be able to be stored because of the swap slots consumed by the cold pages. What zram cannot do (compcache had the option to configure a backing block device) is evict pages out to physical disk. Ideally you want to age data out of the in-kernel compressed swap space to disk so that you can use kernel memory for caching warm swap pages or free it for more productive use.

zswap

  • Status: Posted to LKML on Dec 11th, 2012
  • Implementation: compressed in-kernel cache for swap pages. In-kernel cache is compressed, the compression algorithm is pluggable using the CryptoAPI and the storage for pages is dynamically allocated. Older pages can be evicted to disk making this a sort of write-behind cache.
  • Usage: Cache swap pages destined for regular swap devices (or swap files).
  • Benefits:
    1. Integration with the swap code (using the Frontswap API) allows zswap to choose to store only pages that compress well and to handle memory allocation failures; in those cases pages are sent to the backing swap device.
    2. The oldest pages in the cache are pushed out to the backing swap device to make room for newer pages. This solves the LRU inversion problem that a lack of page eviction would present.
  • Drawbacks:
    1. Needs a physical swap device (or swapfile).
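
The "store only pages that compress well" idea can be sketched in userspace. This is a rough illustration and not how zswap is implemented: it uses Python's `zlib` as a stand-in for the kernel's pluggable compressors, and the function name `store_in_pool` and the `max_ratio` threshold are assumptions invented for the example.

```python
import os
import zlib

PAGE_SIZE = 4096  # a common page size; an assumption for this sketch


def store_in_pool(page, max_ratio=0.5):
    """Decide whether a page is worth keeping in a compressed pool.

    Returns (True, compressed_bytes) when the compressed size is at most
    max_ratio of the original size. Otherwise returns (False, None),
    meaning the page would go to the backing swap device instead.
    max_ratio is a made-up threshold, not a real zswap tunable.
    """
    compressed = zlib.compress(page)
    if len(compressed) <= max_ratio * len(page):
        return True, compressed
    return False, None


if __name__ == "__main__":
    zero_page = bytes(PAGE_SIZE)         # highly compressible
    random_page = os.urandom(PAGE_SIZE)  # effectively incompressible
    print("zero page stored:  ", store_in_pool(zero_page)[0])
    print("random page stored:", store_in_pool(random_page)[0])
```

The point of the trade-off is visible here: every page costs one compression attempt in CPU time, and only the pages that shrink enough earn a place in RAM.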

Add RAM to Ubuntu 13.10+ for free: zRAM


The generic mainline kernel shipped with Ubuntu includes a module called zram. It is a neat trick for adding “free” RAM to your machine without any hardware change: it creates a compressed in-memory block device for swap, which costs a little CPU but effectively gives you more usable RAM.

If you are on a VPS with 512 MB of RAM, for example, this would effectively give you about 750 MB, at the cost of just a little CPU. I don’t even notice it on the Munin graphs.

To install:

apt-get install zram-config

Make sure it’s started and running:

cat /proc/swaps

If you see something like this

# cat /proc/swaps 
Filename                                Type            Size    Used    Priority
/dev/zram0                              partition       62712   6804    5
/dev/zram1                              partition       62712   6768    5
/dev/zram2                              partition       62712   6744    5
/dev/zram3                              partition       62712   6768    5

then it’s already running.
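
If you would rather check this from a script than by eye, the same table can be parsed. A minimal sketch, assuming the five-column /proc/swaps layout shown above; the function name `zram_swap_devices` is made up, and the sample text stands in for the real file.

```python
def zram_swap_devices(swaps_text):
    """Return (device, size, used, priority) for each zram swap area.

    Expects text in the /proc/swaps format: a header line followed by
    one whitespace-separated row per active swap area. Size and used
    are returned as the integers printed in the table.
    """
    devices = []
    for line in swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 5 and fields[0].startswith("/dev/zram"):
            name, _kind, size, used, priority = fields
            devices.append((name, int(size), int(used), int(priority)))
    return devices


if __name__ == "__main__":
    # Sample copied from the output above; a real check would read
    # the contents of /proc/swaps instead.
    sample = (
        "Filename                Type        Size    Used    Priority\n"
        "/dev/zram0              partition   62712   6804    5\n"
        "/dev/zram1              partition   62712   6768    5\n"
    )
    for name, size, used, priority in zram_swap_devices(sample):
        print(f"{name}: {used} of {size} used, priority {priority}")
```

An empty result means no zram swap device is active.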

Reboot your machine, and voilà. You might even turn your regular disk swap off.

SAP Kernel 6.40_EX2 on Oracle 11g problem solved !!


Finally found the solution…

The problem appeared when I upgraded to Oracle 11g (from Oracle 10g). Actually, the problem had been around since the upgrade from Oracle 9i. Work processes (disp+work) would often suddenly die on their own, even though I was already using SAP Kernel 6.40_EX2 (the latest one). The dw (disp+work) version used for the work processes was the release dated 5 March 2011.

Here is the dev_w0 log:

====================

M Sat Mar 26 00:36:16 2011
M  in_ThErrHandle: 1
M  ThIErrHandle: new stat of W0 is WP_SHUTDOWN
M  ThIErrHandle: I’m during shutdown
M  PfStatDisconnect: disconnect statistics
M  Entering ThSetStatError
B  db_sqlbreak() = 1
M  ThIErrHandle: don’t try rollback again
M  ThShutDownServer: shutdown server
M  ThExecShutDown: perform exclusive shutdown actions
M  ThCheckComOrRb (event=1, full_commit=1)
M  ThCallHooks: call hook >ASTAT-collect commit handling< for event BEFORE_COMMIT
M  ThCallHooks: call hook >rsts_before_commit< for event BEFORE_COMMIT
M  SosSearchAnchor: search anchor for 10
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThCheckComOrRb (event=3, full_commit=1)
M  ThCallHooks: call hook >ThVBICmRbHook< for event AFTER_COMMIT
M  ThVBICmRbHook: called for commit
M  ThCallHooks: call hook >ThNoClearPrevErr< for event AFTER_COMMIT
M  ThNoClearPrevErr: clear prev no err
M  ThCallHooks: call hook >rsts_after_commit< for event AFTER_COMMIT
M  SosSearchAnchor: search anchor for 10
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThCallHooks: call hook >dyKeyTableRest< for event AFTER_COMMIT
M  ThCallHooks: call hook >SpoolHandleHook< for event AFTER_COMMIT
M  SosSearchAnchor: search anchor for 2
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThUsrDelEntry (*, *, emaprd2_WPR_10      ) o.k.
M  ThICommit3: full commit, set time, keep resources, redispatch
M  ThICommit3: commit and keep resources
M  ThCheckComOrRb (event=1, full_commit=0)
M  ThCallHooks: call hook >ASTAT-collect commit handling< for event BEFORE_COMMIT
M  ThCallHooks: call hook >rsts_before_commit< for event BEFORE_COMMIT
M  SosSearchAnchor: search anchor for 10
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThCheckComOrRb (event=3, full_commit=0)
M  ThCallHooks: call hook >ThVBICmRbHook< for event AFTER_COMMIT
M  ThVBICmRbHook: called for commit
M  ThCallHooks: call hook >ThNoClearPrevErr< for event AFTER_COMMIT
M  ThNoClearPrevErr: clear prev no err
M  ThCallHooks: call hook >rsts_after_commit< for event AFTER_COMMIT
M  SosSearchAnchor: search anchor for 10
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThCallHooks: call hook >dyKeyTableRest< for event AFTER_COMMIT
M  ThCallHooks: call hook >SpoolHandleHook< for event AFTER_COMMIT
M  SosSearchAnchor: search anchor for 2
M  SosSearchAnchor: invalid tid/mode T-1/M255
M  ThAlarm: set alarm to 600 sec
M  ThICommit3 o.k.
M  ThExecShutDown: ThUsrDelEntry o.k.
M  ThExecShutDown: called rsau_log_system_stop
M  PfStatIndInit: Initializing Index-Record
M  PfWriteIntoFile: copied shared buf (0 bytes) to local buf
M  PfTimeFactor: new value: 25200
M  Entering ThReadDetachMode
M  call ThrShutDown (1)…
B  Disconnecting from ALL connections:
B  Wp  Hdl ConName                        ConId     ConState     TX  HC  PRM RCT FRC TIM MAX OPT Date     Time   DBHost
B  000 000 R/3                            000000000 INACTIVE     NO  NO  YES NO  NO  000 255 255 20110326 002910 emaprd2
C  Disconnecting from connection 0 …
C  Close user session (con_hdl=0,svchp=113176338,usrhp=1131b6f80)
C  Detaching from DB Server (con_hdl=0,svchp=113176338,srvhp=1131773e0)
C  Now connection 0 is disconnected
B  Disconnected from connection 0
B  statistics db_con_commit (com_total=107, com_tx=58)
B  statistics db_con_rollback (roll_total=1, roll_tx=0)
M  ***LOG Q02=> wp_halt, WPStop (Workproc 0 336276) [dpuxtool.c   318]

====================

I had already asked on the sdn.sap.com forum and asked a friend who is an SAP Basis consultant. No answer yet. Then yesterday, when I looked at the SAP Service Marketplace (http://service.sap.com), it turned out that an update for dw (disp+work) dated 25 March 2011 had been released. I downloaded it and updated dw, then ran the compression process as usual, picking the large tables.

And, so far, everything is running smoothly. The dw process runs without problems.

Microsoft Patches Linux; Linus Responds


Microsoft surprised everyone by contributing patches to the Linux kernel (http://www.microsoft.com/presspass/features/2009/Jul09/07-20LinuxQA.mspx). Many parties, especially those fanatical about the open source movement, said they rejected the patches provided by Microsoft. So how did Linus Torvalds respond, as the chief commander who can accept or reject a patch to the Linux kernel? As we know, although anyone can submit a patch to the Linux kernel, the final decision remains in Linus Torvalds' hands.

Here is Linus Torvalds' original answer, as quoted by Linux Magazine:

“I haven’t. Mainly because I’m not personally all that interested in driver code (it doesn’t affect anything else), especially when I wouldn’t use it myself.

So for things like that, I just trust the maintainers. I tend to look at code when bugs happen, or when it crosses multiple subsystems, or when it’s one of the core subsystems that I’m actively involved in (ie things like VM, core device resource handling, basic kernel code etc).

I’ll likely look at it when the code is actually submitted to me by the maintainers (Greg [Kroah-Hartman], in this case), just out of morbid curiosity.”

In general, Linus does not automatically reject or accept. This is something we may need to watch in upcoming Linux kernel releases. Will the patches be accepted or not? The decision remains in Linus Torvalds' hands.