Containerized Data Importer

Objective

This guide explains how to use the Containerized Data Importer (CDI) to manage preconfigured data volumes in a Kubernetes (K8s) cluster. CDI lets you use preconfigured volumes as the foundation of a Kubernetes virtual machine (VM) that runs through KubeVirt. Base images are stored in DataVolumes (DVs) or Persistent Volume Claims (PVCs), which makes it more efficient to run virtualization on K8s from customized images.

Using the instructions provided in this guide, you can create volumes, import images into them, validate the imports, and use the volumes to launch virtual machines.


Prerequisites

The following prerequisites apply:

Note: If you do not have an account, see Create an Account.

  • A Kubernetes Cluster.

Example Manifests

The commands in this guide assume that each manifest is saved under the file name used in the corresponding kubectl command (for example, importer-fedora-pvc.yml for the Fedora PVC Import manifest). The manifests also assume that the cdi-example namespace already exists.

Fedora PVC Import
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "fedora-pvc"
  namespace: "cdi-example"
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.import.endpoint: "http://mirrors.kernel.org/fedora/releases/36/Cloud/x86_64/images/Fedora-Cloud-Base-36-1.5.x86_64.raw.xz"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

        
Fedora PVC VM
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/os: linux
  name: "fedora-pvc-vm"
  namespace: "cdi-example"
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk0
          - cdrom:
              bus: sata
              readonly: true
            name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
      - name: disk0
        persistentVolumeClaim:
          claimName: fedora-pvc
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            hostname: fedora-pvc-vm
            ssh_pwauth: True
            disable_root: false
            ssh_authorized_keys:
            - 'Put your ssh key here!'
        name: cloudinitdisk

        
CirrOS DV Import
# This example assumes you are using a default storage class
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cirrus-dv
  namespace: cdi-example
spec:
  source:
    http:
      url: "https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img"
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi

        
CirrOS DV VM
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/os: linux
  name: "cirrus-dv-vm"
  namespace: "cdi-example"
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: vm1
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk0
          - cdrom:
              bus: sata
              readonly: true
            name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
      - name: disk0
        persistentVolumeClaim:
          claimName: cirrus-dv
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            hostname: cirrus-dv-vm
            ssh_pwauth: True
            disable_root: false
            ssh_authorized_keys:
            - 'Put your ssh key here!'
        name: cloudinitdisk

        
Cloned CirrOS DVs
# This example assumes you are using a default storage class
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cloned-cirrus-dv1
  namespace: cdi-example
spec:
  source:
    pvc:
      namespace: cdi-example
      name: cirrus-dv
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cloned-cirrus-dv2
  namespace: cdi-example
spec:
  source:
    pvc:
      namespace: cdi-example
      name: cirrus-dv
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cloned-cirrus-dv3
  namespace: cdi-example
spec:
  source:
    pvc:
      namespace: cdi-example
      name: cirrus-dv
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi
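
The three DataVolumes above differ only in their names. As a minimal sketch (not part of the original guide), the same manifests could be generated with a short Python script and piped to kubectl, which accepts JSON as well as YAML. The function names `clone_dv_manifest` and `clone_dv_list` are hypothetical; the field values are copied from the manifest above.

```python
import json


def clone_dv_manifest(name, source_name, namespace="cdi-example", size="500Mi"):
    """Build one DataVolume that clones the PVC `source_name`."""
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",
        "kind": "DataVolume",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The clone source is an existing PVC in the same namespace.
            "source": {"pvc": {"namespace": namespace, "name": source_name}},
            "pvc": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": size}},
            },
        },
    }


def clone_dv_list(count, source_name="cirrus-dv", prefix="cloned-cirrus-dv"):
    """Wrap `count` numbered clones (1..count) in a single List object."""
    items = [
        clone_dv_manifest(f"{prefix}{i}", source_name)
        for i in range(1, count + 1)
    ]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # Print JSON that can be piped to: kubectl create -f -
    print(json.dumps(clone_dv_list(3), indent=2))
```

Changing the argument of `clone_dv_list` is then enough to produce more or fewer clones of the same source volume.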


        
Cloned CirrOS DV VMs
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/os: linux
  name: "cloned-cirrus-dv-vm1"
  namespace: "cdi-example"
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: cloned-cirrus-dv-vm1
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk0
          - cdrom:
              bus: sata
              readonly: true
            name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
      - name: disk0
        persistentVolumeClaim:
          claimName: cloned-cirrus-dv1
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            hostname: cloned-cirrus-dv-vm1
            ssh_pwauth: True
            disable_root: false
            ssh_authorized_keys:
            - 'Put your ssh key here!'
        name: cloudinitdisk
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/os: linux
  name: "cloned-cirrus-dv-vm2"
  namespace: "cdi-example"
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: cloned-cirrus-dv-vm2
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk0
          - cdrom:
              bus: sata
              readonly: true
            name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
      - name: disk0
        persistentVolumeClaim:
          claimName: cloned-cirrus-dv2
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            hostname: cloned-cirrus-dv-vm2
            ssh_pwauth: True
            disable_root: false
            ssh_authorized_keys:
            - 'Put your ssh key here!'
        name: cloudinitdisk
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/os: linux
  name: "cloned-cirrus-dv-vm3"
  namespace: "cdi-example"
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: cloned-cirrus-dv-vm3
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk0
          - cdrom:
              bus: sata
              readonly: true
            name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 1024M
      volumes:
      - name: disk0
        persistentVolumeClaim:
          claimName: cloned-cirrus-dv3
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            hostname: cloned-cirrus-dv-vm3
            ssh_pwauth: True
            disable_root: false
            ssh_authorized_keys:
            - 'Put your ssh key here!'
        name: cloudinitdisk
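
The three VirtualMachine manifests above vary in only four places: the VM name, the kubevirt.io/domain label, the claimName, and the cloud-init hostname. The following Python sketch makes that explicit by building the manifest from two parameters. It is an illustration under the assumption that JSON output is piped to kubectl; `vm_manifest` is a hypothetical helper, and all field values are taken from the manifests above.

```python
import json


def vm_manifest(vm_name, claim_name, namespace="cdi-example",
                ssh_key="Put your ssh key here!"):
    """Build a VirtualMachine that boots from the PVC `claim_name`."""
    user_data = "\n".join([
        "#cloud-config",
        f"hostname: {vm_name}",
        "ssh_pwauth: True",
        "disable_root: false",
        "ssh_authorized_keys:",
        f"- '{ssh_key}'",
        "",
    ])
    disks = [
        {"name": "disk0", "disk": {"bus": "virtio"}},
        {"name": "cloudinitdisk", "cdrom": {"bus": "sata", "readonly": True}},
    ]
    volumes = [
        {"name": "disk0", "persistentVolumeClaim": {"claimName": claim_name}},
        {"name": "cloudinitdisk", "cloudInitNoCloud": {"userData": user_data}},
    ]
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {
            "name": vm_name,
            "namespace": namespace,
            "labels": {"kubevirt.io/os": "linux"},
        },
        "spec": {
            "running": True,
            "template": {
                "metadata": {"labels": {"kubevirt.io/domain": vm_name}},
                "spec": {
                    "domain": {
                        "cpu": {"cores": 2},
                        "devices": {"disks": disks},
                        "machine": {"type": "q35"},
                        "resources": {"requests": {"memory": "1024M"}},
                    },
                    "volumes": volumes,
                },
            },
        },
    }


if __name__ == "__main__":
    # One VM per cloned DataVolume, numbered 1..3 as in the manifests above.
    vms = [vm_manifest(f"cloned-cirrus-dv-vm{i}", f"cloned-cirrus-dv{i}")
           for i in range(1, 4)]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": vms}, indent=2))
```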
        

Create Volumes

There are several ways to provide storage for a VM with CDI. In general, the disk is backed by a PVC or a DV, and data can be either imported or uploaded into it. This guide covers the import method.

Directly Import an Image into a PVC

Directly importing an image into a PVC does not offer the benefits that a DV offers, but it is a quick and easy method.

Step 1: Create a PVC.

Enter the following command. This example creates a PVC from the Fedora PVC Import manifest, saved as importer-fedora-pvc.yml.

❯ kubectl create -f importer-fedora-pvc.yml

The following is an example output:

persistentvolumeclaim/fedora-pvc created
        

At this stage, a pod named importer-fedora-pvc is created. A short-lived create-pvc pod is also spawned. After that, the importer process loads the remote image into the PVC.
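
The import is requested through the cdi.kubevirt.io/storage.import.endpoint annotation on the PVC, as shown in the Fedora manifest. As a hedged sketch, the following Python function builds an equivalent PVC for any endpoint and size; `import_pvc_manifest` is a hypothetical name, and the structure simply mirrors the manifest from this guide.

```python
import json


def import_pvc_manifest(name, namespace, endpoint, size):
    """Build a PVC whose annotation asks CDI to import the image at `endpoint`."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"app": "containerized-data-importer"},
            "annotations": {
                # Same annotation key as in the Fedora PVC Import manifest.
                "cdi.kubevirt.io/storage.import.endpoint": endpoint,
            },
        },
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    manifest = import_pvc_manifest(
        name="fedora-pvc",
        namespace="cdi-example",
        endpoint=(
            "http://mirrors.kernel.org/fedora/releases/36/Cloud/x86_64/"
            "images/Fedora-Cloud-Base-36-1.5.x86_64.raw.xz"
        ),
        size="5Gi",
    )
    # Print JSON that can be piped to: kubectl create -f -
    print(json.dumps(manifest, indent=2))
```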

Step 2: Verify the PVC.

Enter the following command to verify the PVC:

❯ kubectl get pvc -n cdi-example fedora-pvc -o yaml

The following is an example output:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/storage.condition.running: "true"
    cdi.kubevirt.io/storage.condition.running.message: ""
    cdi.kubevirt.io/storage.condition.running.reason: Pod is running
    cdi.kubevirt.io/storage.import.endpoint: http://mirrors.kernel.org/fedora/releases/36/Cloud/x86_64/images/Fedora-Cloud-Base-36-1.5.x86_64.raw.xz
    cdi.kubevirt.io/storage.import.importPodName: importer-fedora-pvc
    cdi.kubevirt.io/storage.pod.phase: Running
    cdi.kubevirt.io/storage.pod.restarts: "0"
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
    volume.kubernetes.io/selected-node: mm-igw-95687
  creationTimestamp: "2022-10-21T01:59:42Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: containerized-data-importer
  name: fedora-pvc
  namespace: cdi-example
  resourceVersion: "458169"
  selfLink: /api/v1/namespaces/cdi-example/persistentvolumeclaims/fedora-pvc
  uid: 75fee4e9-3982-4898-ab7b-b8505a63e245
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard
  volumeMode: Filesystem
  volumeName: pvc-75fee4e9-3982-4898-ab7b-b8505a63e245
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  phase: Bound
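
The fields that matter most in this output are the PVC phase and the cdi.kubevirt.io/storage.* annotations. As a sketch, the following Python helper condenses them into a short summary. It assumes the PVC is fetched with `kubectl get pvc ... -o json`; the helper names `import_status` and `fetch_pvc` are hypothetical, and the annotation keys are the ones shown in the output above.

```python
import json
import subprocess

ANNOTATION_PREFIX = "cdi.kubevirt.io/storage."


def import_status(pvc):
    """Summarize the import-related fields of a PVC object (parsed JSON)."""
    annotations = pvc.get("metadata", {}).get("annotations", {})
    return {
        "pvc_phase": pvc.get("status", {}).get("phase"),
        "pod_name": annotations.get(ANNOTATION_PREFIX + "import.importPodName"),
        "pod_phase": annotations.get(ANNOTATION_PREFIX + "pod.phase"),
        "pod_restarts": annotations.get(ANNOTATION_PREFIX + "pod.restarts"),
    }


def fetch_pvc(name, namespace):
    """Fetch a PVC as a dict; requires kubectl and access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "pvc", "-n", namespace, name, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

For example, `import_status(fetch_pvc("fedora-pvc", "cdi-example"))` would return the four values for the PVC created in Step 1.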
        
Step 3: Validate the import of the VM image.

Enter the following command:

❯ kubectl logs -n cdi-example -f importer-fedora-pvc

The following is an example output:

I1021 01:59:53.814007       1 importer.go:104] Starting importer
I1021 01:59:53.814108       1 importer.go:171] begin import process
I1021 01:59:53.885795       1 data-processor.go:379] Calculating available size
I1021 01:59:53.885881       1 data-processor.go:391] Checking out file system volume size.
I1021 01:59:53.885924       1 data-processor.go:399] Request image size not empty.
I1021 01:59:53.885952       1 data-processor.go:404] Target size 5Gi.
I1021 01:59:53.981727       1 data-processor.go:282] New phase: TransferDataFile
I1021 01:59:53.982171       1 util.go:192] Writing data...
I1021 01:59:54.983083       1 prometheus.go:72] 0.02
I1021 01:59:55.983257       1 prometheus.go:72] 0.03
...
I1021 02:07:17.639233       1 prometheus.go:72] 100.00
I1021 02:07:18.076313       1 data-processor.go:282] New phase: Resize
W1021 02:07:18.088348       1 data-processor.go:361] Available space less than requested size, resizing image to available space 5073010688.
I1021 02:07:18.088392       1 data-processor.go:369] Calculated new size is < than current size, not resizing: requested size 5073010688, virtual size: 5368709120.
I1021 02:07:18.088423       1 data-processor.go:288] Validating image
I1021 02:07:18.099427       1 data-processor.go:282] New phase: Complete
I1021 02:07:18.099594       1 importer.go:215] Import Complete
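
The prometheus.go lines in this log carry the import progress as a percentage, and the last line reports completion. The following Python sketch extracts both from captured log text. It assumes the log format shown above, which may differ between CDI versions; the function names are hypothetical.

```python
import re

# Matches lines such as: "... prometheus.go:72] 0.03"
PROGRESS_RE = re.compile(r"prometheus\.go:\d+\]\s+(\d+(?:\.\d+)?)\s*$")


def latest_progress(log_text):
    """Return the last reported progress percentage, or None if none is found."""
    latest = None
    for line in log_text.splitlines():
        match = PROGRESS_RE.search(line)
        if match:
            latest = float(match.group(1))
    return latest


def import_complete(log_text):
    """Return True if the log contains the final 'Import Complete' message."""
    return any(
        line.rstrip().endswith("Import Complete")
        for line in log_text.splitlines()
    )
```

The log text can be captured by running the kubectl logs command from this step without the -f option.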
        

Directly Import an Image into a DV

A DataVolume is a CDI abstraction built on top of a PVC. It offers fuller API integration and management, easier integration with KubeVirt, and features such as cloning. For more information on DataVolumes, see CDI DataVolumes.

Do the following to import an image into a DV:

Step 1: Create a DV.

Enter the following command to create a DV. This example uses the CirrOS DV importer manifest.

❯ kubectl create -f importer-cirrus-dv.yml

The following is an example output:

datavolume.cdi.kubevirt.io/cirrus-dv created
        
Step 2: Verify that the DV is created.

An importer pod is launched, and it spawns a short-lived pod before populating the DV. Because the CirrOS image is small, this import is much faster than the Fedora import, so the importer pod may complete before you can inspect its logs. You can still check the status using the following command:

❯ kubectl get dv -n cdi-example cirrus-dv -o yaml

The following is an example output:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  creationTimestamp: "2022-10-21T02:09:29Z"
  generation: 5
  name: cirrus-dv
  namespace: cdi-example
  resourceVersion: "461255"
  selfLink: /apis/cdi.kubevirt.io/v1beta1/namespaces/cdi-example/datavolumes/cirrus-dv
  uid: 8bbe63fa-e6c6-4a8f-a7b8-5c7f1a392cf4
spec:
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi
  source:
    http:
      url: https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img
status:
  claimName: cirrus-dv
  conditions:
  - lastHeartbeatTime: "2022-10-21T02:09:37Z"
    lastTransitionTime: "2022-10-21T02:09:37Z"
    message: PVC cirrus-dv Bound
    reason: Bound
    status: "True"
    type: Bound
  - lastHeartbeatTime: "2022-10-21T02:10:41Z"
    lastTransitionTime: "2022-10-21T02:10:41Z"
    status: "True"
    type: Ready
  - lastHeartbeatTime: "2022-10-21T02:10:41Z"
    lastTransitionTime: "2022-10-21T02:09:29Z"
    message: Import Complete
    reason: Completed
    status: "False"
    type: Running
  phase: Succeeded
  progress: 100.0%
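
In the output above, the import is finished when the phase is Succeeded and the Ready condition is "True". As a sketch of how that check could be automated, the following Python code polls a DataVolume until both hold or a timeout expires. It assumes the object is fetched with `kubectl get dv ... -o json`; `dv_succeeded`, `kubectl_get_dv`, and `wait_for_dv` are hypothetical names.

```python
import json
import subprocess
import time


def dv_succeeded(dv):
    """Return True if the DataVolume reports phase Succeeded and Ready=True."""
    status = dv.get("status", {})
    ready = any(
        cond.get("type") == "Ready" and cond.get("status") == "True"
        for cond in status.get("conditions", [])
    )
    return status.get("phase") == "Succeeded" and ready


def kubectl_get_dv(name, namespace):
    """Fetch a DataVolume as a dict; requires kubectl and cluster access."""
    result = subprocess.run(
        ["kubectl", "get", "dv", "-n", namespace, name, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def wait_for_dv(fetch, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Call `fetch()` repeatedly until the DataVolume succeeds or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if dv_succeeded(fetch()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A call such as `wait_for_dv(lambda: kubectl_get_dv("cirrus-dv", "cdi-example"))` returns True once the import completes and False on timeout. Passing the fetch function as a parameter keeps the polling logic independent of kubectl.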
        
Step 3: Clone the DV.

One of the useful features of DVs is the ability to clone them, which lets you spawn VMs efficiently. This example creates three clones that are later used to launch three VMs. The cloned-cirrus-dvs manifest is used.

Enter the following command:

❯ kubectl create -f cloned-cirrus-dvs.yml

The following is an example output:

datavolume.cdi.kubevirt.io/cloned-cirrus-dv1 created
datavolume.cdi.kubevirt.io/cloned-cirrus-dv2 created
datavolume.cdi.kubevirt.io/cloned-cirrus-dv3 created
        
Step 4: Verify the cloned DVs.

Several pods are spawned for the clone operation; unlike the import case, these are 'upload' pods. Check the progress by inspecting the DVs. Enter the following command:

❯ kubectl describe dv -n cdi-example

The following is an example output:

...
Status:
  Claim Name:  cloned-cirrus-dv1
  Conditions:
    Last Heartbeat Time:   2022-10-21T02:26:42Z
    Last Transition Time:  2022-10-21T02:26:42Z
    Message:               PVC cloned-cirrus-dv1 Bound
    Reason:                Bound
    Status:                True
    Type:                  Bound
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Status:                True
    Type:                  Ready
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Message:               Clone Complete
    Reason:                Completed
    Status:                False
    Type:                  Running
  Phase:                   Succeeded
  Progress:                100.0%
...
Status:
  Claim Name:  cloned-cirrus-dv2
  Conditions:
    Last Heartbeat Time:   2022-10-21T02:26:42Z
    Last Transition Time:  2022-10-21T02:26:42Z
    Message:               PVC cloned-cirrus-dv2 Bound
    Reason:                Bound
    Status:                True
    Type:                  Bound
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Status:                True
    Type:                  Ready
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Message:               Clone Complete
    Reason:                Completed
    Status:                False
    Type:                  Running
  Phase:                   Succeeded
  Progress:                100.0%
...
Status:
  Claim Name:  cloned-cirrus-dv3
  Conditions:
    Last Heartbeat Time:   2022-10-21T02:26:46Z
    Last Transition Time:  2022-10-21T02:26:46Z
    Message:               PVC cloned-cirrus-dv3 Bound
    Reason:                Bound
    Status:                True
    Type:                  Bound
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Status:                True
    Type:                  Ready
    Last Heartbeat Time:   2022-10-21T02:27:29Z
    Last Transition Time:  2022-10-21T02:27:29Z
    Message:               Clone Complete
    Reason:                Completed
    Status:                False
    Type:                  Running
  Phase:                   Succeeded
  Progress:                100.0%
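
With several clones, it helps to reduce this output to one phase per DV. The following Python sketch pairs each "Claim Name" with the "Phase" that follows it in the describe text. It assumes the text layout shown above, which is meant for human readers and may change; parsing the `-o json` output is more robust where that is an option. The function names are hypothetical.

```python
def phases_from_describe(text):
    """Map each claim name in `kubectl describe dv` output to its phase."""
    phases = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Claim Name:"):
            current = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("Phase:") and current is not None:
            phases[current] = stripped.split(":", 1)[1].strip()
            current = None
    return phases


def unfinished(phases):
    """Return the sorted names of DVs whose phase is not Succeeded."""
    return sorted(name for name, phase in phases.items() if phase != "Succeeded")
```

An empty list from `unfinished` means that every DV in the captured output has reached the Succeeded phase.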
        

Launch VMs

After the images are loaded, you can test launching the VMs. This example launches five VMs: one from the Fedora PVC, one from the CirrOS DV, and three from the cloned CirrOS DVs.

Note: Make sure to add your own SSH key to the example manifests.

Step 1: Launch Fedora PVC VM.

Enter the following command:

❯ kubectl create -f fedora-pvc-vm.yml

The following is an example output:

virtualmachine.kubevirt.io/fedora-pvc-vm created
        
Step 2: Launch CirrOS DV VM.

Enter the following command:

❯ kubectl create -f cirrus-dv-vm.yml

The following is an example output:

virtualmachine.kubevirt.io/cirrus-dv-vm created
        
Step 3: Launch Cloned CirrOS DV VMs.

Enter the following command:

❯ kubectl create -f cloned-cirrus-dv-vms.yml

The following is an example output:

virtualmachine.kubevirt.io/cloned-cirrus-dv-vm1 created
virtualmachine.kubevirt.io/cloned-cirrus-dv-vm2 created
virtualmachine.kubevirt.io/cloned-cirrus-dv-vm3 created

Step 4: Verify the launched VMs.

Enter the following command:

❯ kubectl get pods -n cdi-example

The following is an example output. Each VM runs in its own virt-launcher pod, so you can verify that all five VMs have been created.

NAME                                       READY   STATUS    RESTARTS   AGE
virt-launcher-cirrus-dv-vm-l95w2           1/1     Running   0          6m58s
virt-launcher-cloned-cirrus-dv-vm1-n4phn   1/1     Running   0          76s
virt-launcher-cloned-cirrus-dv-vm2-9lvwm   1/1     Running   0          76s
virt-launcher-cloned-cirrus-dv-vm3-qncxx   1/1     Running   0          76s
virt-launcher-fedora-pvc-vm-4l6fh          1/1     Running   0          9m22s

Step 5: Test the launched VMs.

Ensure that you can connect to the launched VMs and that they function as expected.

Note: You will need either virtctl or the virt plugin for kubectl.

Enter the following command to check the Fedora VM:

❯ kubectl virt console -n cdi-example fedora-pvc-vm

Successfully connected to fedora-pvc-vm console. The escape sequence is ^]

fedora-pvc-vm login:
        

Similarly, connect to the remaining VMs and ensure that they are functioning as expected.
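
Before connecting to each console, the `kubectl get pods` listing shown earlier can be checked against the list of expected VMs. The following Python sketch parses that table and reports any VM whose virt-launcher pod is missing or not Running. It assumes the pod naming seen in the example output (virt-launcher-, then the VM name, then a random suffix); the function names are hypothetical.

```python
PREFIX = "virt-launcher-"


def launcher_pods(table_text):
    """Map VM name to pod status, parsed from `kubectl get pods` table output."""
    pods = {}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue  # Skip blank lines and the header row.
        name, status = fields[0], fields[2]
        if name.startswith(PREFIX):
            # Drop the prefix and the trailing random suffix.
            vm_name = name[len(PREFIX):].rsplit("-", 1)[0]
            pods[vm_name] = status
    return pods


def missing_or_not_running(pods, expected):
    """Return the sorted VM names that have no pod in the Running state."""
    return sorted(vm for vm in expected if pods.get(vm) != "Running")


# The five VMs launched in this guide.
EXPECTED_VMS = [
    "fedora-pvc-vm",
    "cirrus-dv-vm",
    "cloned-cirrus-dv-vm1",
    "cloned-cirrus-dv-vm2",
    "cloned-cirrus-dv-vm3",
]
```

Calling `missing_or_not_running(launcher_pods(output), EXPECTED_VMS)` on the captured listing returns an empty list when all five VMs are up.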

