Comparing Accuracy and Execution Time: Classical SVM vs Quantum SVM (QSVM) on the Breast Cancer Dataset

SVM (Support Vector Machine) is a popular algorithm commonly used to solve classification problems.
https://id.wikipedia.org/wiki/Support-vector_machine

This tutorial has three parts: a classical SVM demonstration; a QSVM (Quantum-enhanced SVM) demonstration; and a comparison of SVM and QSVM accuracy and execution time on the Breast Cancer dataset.

For details on the breast cancer dataset, see: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html

The source code is available on GitHub: https://github.com/keamanansiber/qiskit/tree/master/3QuantumMachineLearning

Part I: Classical SVM demonstration (RBF kernel)

1) Create a Jupyter notebook:

import matplotlib.pyplot as plt
import numpy as np
from qiskit import BasicAer
from qiskit.ml.datasets import ad_hoc_data, breast_cancer
from qiskit.aqua import aqua_globals, QuantumInstance
from qiskit.aqua.utils import split_dataset_to_data_and_labels, map_label_to_class_name
from qiskit.aqua.algorithms import SklearnSVM, QSVM
from qiskit.circuit.library import ZZFeatureMap

2) Prepare the datasets for training, testing, and prediction.

feature_dim = 2 # dimension of each data point
training_dataset_size = 20
testing_dataset_size = 10
sample_Total, training_input, test_input, class_labels = breast_cancer(
    training_size=training_dataset_size,
    test_size=testing_dataset_size,
    n=feature_dim, plot_data=True
)
datapoints, class_to_label = split_dataset_to_data_and_labels(test_input)
label_to_class = {label:class_name for class_name, label in class_to_label.items()}
print(class_to_label, label_to_class)
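
To make the two dictionaries concrete, here is a minimal sketch of the label mapping in plain Python. The mapping {'A': 0, 'B': 1} is an assumption inferred from the predicted labels and classes printed in step 4 below, and labels_to_names is a hypothetical helper, not a Qiskit function.

```python
# Assumed mapping, inferred from the printed results (label 0 -> 'A', 1 -> 'B').
class_to_label = {"A": 0, "B": 1}

# Invert the dictionary the same way the notebook does.
label_to_class = {label: name for name, label in class_to_label.items()}


def labels_to_names(labels, mapping):
    """Translate numeric labels into class names (hypothetical helper)."""
    return [mapping[int(label)] for label in labels]


names = labels_to_names([1.0, 0.0, 1.0], label_to_class)
print(names)
```

The inverted dictionary is what turns the numeric predictions of the classifier back into readable class names.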

3) Run the algorithm.
For testing, the result includes the details and the success ratio.
For prediction, the result includes the predicted labels.

aqua_globals.seed = 30
# train on training_input, test on test_input, predict the labels of datapoints[0]
result = SklearnSVM(training_input, test_input, datapoints[0]).run()
print("kernel matrix during the training:")
kernel_matrix = result['kernel_matrix_training']
img = plt.imshow(np.asmatrix(kernel_matrix), interpolation='nearest', origin='upper', cmap='bone_r')
plt.show()
print("testing success ratio: ", result['testing_accuracy'])
print("detail result: ", result)
print("prediction of datapoints:")
print("ground truth: {}".format(map_label_to_class_name(datapoints[1], label_to_class)))
print("predicted classes:", result['predicted_classes'])
print("Training dataset details: ", training_input)
print("Testing dataset details: ", test_input)
print("Prediction dataset details: ", datapoints)
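
The kernel matrix plotted above holds the pairwise RBF similarity between training points. As a rough sketch of how one entry comes about, the snippet below recomputes a few entries in plain Python from the first three class 'A' training points listed in step 4. The value gamma = 0.5 is an assumption inferred from the printed matrix, not a setting taken from the library documentation.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF similarity exp(-gamma * ||x - y||^2) between two points."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# First three class 'A' training points, copied from the output in step 4.
points = [
    (0.50265482, 5.15221195),
    (4.96371639, 1.50796447),
    (1.38230077, 4.46106157),
]

# Pairwise kernel matrix for these three points.
kernel = [[rbf_kernel(p, q) for q in points] for p in points]
for row in kernel:
    print(["%.6e" % value for value in row])
```

With this gamma, the entries for the first and third points and for the first and second points agree with the 5.34869269e-01 and 6.23311212e-08 values in the first row of the printed kernel_matrix_training.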

4) Contents of the training data, testing data, and result

### Training dataset details (training_dataset_size = 20):
{
'A': array([[0.50265482, 5.15221195], [4.96371639, 1.50796447],
[1.38230077, 4.46106157], [1.75929189, 1.13097336],
[1.13097336, 2.89026524], [6.03185789, 2.95309709],
[1.44513262, 4.58672527], [1.13097336, 5.02654825],
[1.82212374, 0.75398224], [3.83274304, 4.71238898],
[0. , 0.56548668], [6.22035345, 0. ],
[0.37699112, 0.56548668], [1.63362818, 5.34070751],
[0.75398224, 5.40353936], [0.75398224, 4.33539786],
[4.71238898, 1.63362818], [0.06283185, 3.20442451],
[2.82743339, 0.69115038], [5.0893801 , 5.96902604]]),
'B': array([[1.31946891, 0.50265482], [1.50796447, 0.69115038],
[4.08407045, 6.09468975], [2.51327412, 4.96371639],
[1.00530965, 0.87964594], [1.0681415 , 3.58141563],
[5.96902604, 5.34070751], [5.0893801 , 2.07345115],
[4.20973416, 1.63362818], [6.03185789, 4.58672527],
[2.76460154, 5.40353936], [4.83805269, 6.22035345],
[1.13097336, 3.89557489], [3.33008821, 5.78053048],
[1.50796447, 2.57610598], [4.96371639, 5.52920307],
[0. , 4.20973416], [2.45044227, 3.45575192],
[5.0893801 , 5.27787566], [6.22035345, 4.20973416]])
}

### Testing dataset details (testing_dataset_size = 10):
{
'A': array([[5.78053048, 3.51858377], [3.64424748, 4.33539786],
[0.31415927, 3.83274304], [1.57079633, 1.13097336],
[1.00530965, 1.25663706], [1.00530965, 5.71769863],
[1.63362818, 4.96371639], [5.71769863, 0.43982297],
[5.2150438 , 5.90619419], [2.26194671, 0.62831853]]),
'B': array([[3.20442451, 5.0893801 ], [5.15221195, 0.12566371],
[2.70176968, 2.57610598], [5.71769863, 5.52920307],
[4.1469023 , 3.58141563], [4.96371639, 2.07345115],
[1.57079633, 4.08407045], [3.26725636, 2.07345115],
[2.63893783, 3.01592895], [3.14159265, 5.02654825]])
}

### Detailed result: {

'kernel_matrix_training': array([[1.00000000e+00, 6.23311212e-08, 5.34869269e-01, …,
3.55816121e-02, 2.68050554e-05, 5.10647701e-08],
[6.23311212e-08, 1.00000000e+00, 2.09439620e-05, …,
6.37601889e-03, 8.13625109e-04, 1.18036313e-02],
[5.34869269e-01, 2.09439620e-05, 1.00000000e+00, …,
3.41029301e-01, 7.43002452e-04, 8.00887999e-06],
…,
[3.55816121e-02, 6.37601889e-03, 3.41029301e-01, …,
1.00000000e+00, 5.84561254e-03, 6.17173870e-04],
[2.68050554e-05, 8.13625109e-04, 7.43002452e-04, …,
5.84561254e-03, 1.00000000e+00, 2.98193254e-01],
[5.10647701e-08, 1.18036313e-02, 8.00887999e-06, …,
6.17173870e-04, 2.98193254e-01, 1.00000000e+00]]),

'svm': {
'alphas': array([[ 9.61749755], [ 8.96539753],
[48.1742236 ], [ 6.32785207],
[34.74962891], [ 9.09086662],
[ 9.59297394], [12.01444267],
[36.96179378], [72.60906954],
[45.15487865], [ 6.44582859],
[32.56345911], [ 9.13718517],
[22.02724311], [21.27065753],
[44.39751753], [ 1.49700966],
[30.37236435], [29.59905104], [ 5.63855282]]),
'bias': array([-0.05836517]),
'support_vectors': array([[1.38230077, 4.46106157],
[1.75929189, 1.13097336], [1.13097336, 2.89026524],
[6.03185789, 2.95309709], [1.82212374, 0.75398224],
[3.83274304, 4.71238898], [0.37699112, 0.56548668],
[0.75398224, 4.33539786], [4.71238898, 1.63362818],
[5.0893801 , 5.96902604], [1.50796447, 0.69115038],
[2.51327412, 4.96371639], [1.0681415 , 3.58141563],
[5.96902604, 5.34070751], [5.0893801 , 2.07345115],
[4.20973416, 1.63362818], [4.83805269, 6.22035345],
[1.13097336, 3.89557489], [1.50796447, 2.57610598],
[4.96371639, 5.52920307], [0. , 4.20973416]]),
'yin': array([[-1.],
[-1.], [-1.],
[-1.], [-1.],
[-1.], [-1.],
[-1.], [-1.],
[-1.], [ 1.],
[ 1.], [ 1.],
[ 1.], [ 1.],
[ 1.], [ 1.],
[ 1.], [ 1.],
[ 1.], [ 1.]])},

'kernel_matrix_testing': array([[4.04135873e-05, 1.78140751e-05, 1.65921174e-05, 8.25744128e-01,
8.66687043e-06, 7.35695984e-02, 5.83430869e-09, 2.33689774e-06,
9.56564716e-02, 3.91177839e-02, 1.99553189e-06, 1.69226704e-03,
1.50327580e-05, 1.86778333e-01, 2.77190974e-01, 4.92803754e-02,
1.66738792e-02, 1.88263723e-05, 6.96838240e-05, 9.49041722e-02,
4.36915658e-08], … ]),

'test_success_ratio': 0.8,

'testing_accuracy': 0.8,

'kernel_matrix_prediction': array([[4.04135873e-05, 1.78140751e-05, 1.65921174e-05, 8.25744128e-01,
8.66687043e-06, 7.35695984e-02, 5.83430869e-09, 2.33689774e-06,
9.56564716e-02, 3.91177839e-02, 1.99553189e-06, 1.69226704e-03,
1.50327580e-05, 1.86778333e-01, 2.77190974e-01, 4.92803754e-02,
1.66738792e-02, 1.88263723e-05, 6.96838240e-05, 9.49041722e-02,
4.36915658e-08], … ]),

'predicted_labels': array([1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 1., 1., 1., 1., 1.]),

'predicted_classes': ['B', 'A', 'B', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'A', 'B', 'A', 'B', 'B', 'B', 'B', 'B', 'B']
}
}
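
The success ratio is simply the fraction of test points whose predicted class matches the true class. Here is a small sketch that recomputes it from the predicted_classes list above. It assumes the ground truth is ordered as ten 'A' points followed by ten 'B' points, matching the order of the testing dataset dictionary; that ordering is an assumption, not something the output states.

```python
# Predicted classes copied from the result above.
predicted_classes = ['B', 'A', 'B', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
                     'B', 'A', 'B', 'A', 'B', 'B', 'B', 'B', 'B', 'B']

# Assumption: test points are ordered as ten 'A' followed by ten 'B'.
ground_truth = ['A'] * 10 + ['B'] * 10

# Count the matches and divide by the number of test points.
correct = sum(p == t for p, t in zip(predicted_classes, ground_truth))
success_ratio = correct / len(ground_truth)
print(correct, "of", len(ground_truth), "correct, ratio =", success_ratio)
```

Under that assumption 16 of the 20 points match, which gives the 0.8 reported as test_success_ratio.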

 

Part II: QSVM (Quantum-enhanced SVM) demonstration

1) Create a new Jupyter notebook (exactly the same as I.1 above):

import matplotlib.pyplot as plt
import numpy as np
from qiskit import BasicAer
from qiskit.ml.datasets import ad_hoc_data, breast_cancer
from qiskit.aqua import aqua_globals, QuantumInstance
from qiskit.aqua.utils import split_dataset_to_data_and_labels, map_label_to_class_name
from qiskit.aqua.algorithms import SklearnSVM, QSVM
from qiskit.circuit.library import ZZFeatureMap

2) Prepare the datasets for training and testing (same as I.2 above):

feature_dim = 2 # dimension of each data point
training_dataset_size = 20
testing_dataset_size = 10
random_seed = 10598
shots = 1024
sample_Total, training_input, test_input, class_labels = breast_cancer(
    training_size=training_dataset_size,
    test_size=testing_dataset_size,
    n=feature_dim, plot_data=False
)
datapoints, class_to_label = split_dataset_to_data_and_labels(test_input)
print(class_to_label)

3) Now build the SVM.
Create the svm instance by instantiating the QSVM class.
Create the feature map that the svm instance requires by instantiating the ZZFeatureMap class:

backend = BasicAer.get_backend('qasm_simulator')
feature_map = ZZFeatureMap(feature_dim, reps=2)
svm = QSVM(feature_map, training_input, test_input, None)  # the data for prediction can be fed later
svm.random_seed = random_seed
# fix the simulator and transpiler seeds so repeated runs are reproducible
quantum_instance = QuantumInstance(backend, shots=shots, seed_simulator=random_seed, seed_transpiler=random_seed)
result = svm.run(quantum_instance)
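
For intuition about what the feature map contributes, the sketch below computes a quantum kernel entry exactly in plain Python for two features. It is an assumption-laden approximation: the phase angles follow the two-qubit ZZFeatureMap circuit as drawn in the Qiskit documentation (a Hadamard layer, then phases 2*x[0], 2*x[1] and 2*(pi - x[0])*(pi - x[1])), and the function names are hypothetical. QSVM itself estimates this quantity from 1024 shots, so its values will differ by sampling noise.

```python
import cmath
import math


def hadamard_all(state):
    """Apply a Hadamard gate to both qubits of a 4-amplitude state."""
    return [
        0.5 * sum((-1) ** bin(j & k).count("1") * state[k] for k in range(4))
        for j in range(4)
    ]


def phase_layer(state, x):
    """Diagonal data-encoding layer: single-qubit phases plus the ZZ phase."""
    out = []
    for index, amplitude in enumerate(state):
        b0, b1 = index & 1, (index >> 1) & 1
        angle = 2 * x[0] * b0 + 2 * x[1] * b1
        angle += 2 * (math.pi - x[0]) * (math.pi - x[1]) * (b0 ^ b1)
        out.append(amplitude * cmath.exp(1j * angle))
    return out


def feature_state(x, reps=2):
    """State prepared from |00> by `reps` rounds of Hadamards and phases."""
    state = [1, 0, 0, 0]
    for _ in range(reps):
        state = phase_layer(hadamard_all(state), x)
    return state


def quantum_kernel(x, y, reps=2):
    """Squared overlap |<phi(x)|phi(y)>|^2 between two encoded points."""
    sx, sy = feature_state(x, reps), feature_state(y, reps)
    overlap = sum(a.conjugate() * b for a, b in zip(sx, sy))
    return abs(overlap) ** 2


x, y = (0.50265482, 5.15221195), (1.38230077, 4.46106157)
print(quantum_kernel(x, x), quantum_kernel(x, y))
```

As with the classical kernel, a point compared with itself gives 1 and the matrix is symmetric; only the similarity measure between different points changes.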

4) Check the results

print("kernel matrix during the training:")
kernel_matrix = result['kernel_matrix_training']
img = plt.imshow(np.asmatrix(kernel_matrix), interpolation='nearest', origin='upper', cmap='bone_r')
plt.show()
print("testing success ratio: ", result['testing_accuracy'])
print("Training dataset details: ", training_input)
print("Testing dataset details: ", test_input)
print("detail result: ", result)

5) Use the trained svm instance (the trained SVM classifier) to predict labels for newly supplied input data:

predicted_labels = svm.predict(datapoints[0])
predicted_classes = map_label_to_class_name(predicted_labels, svm.label_to_class)
print("ground truth: {}".format(datapoints[1]))
print("prediction: {}".format(predicted_labels))
print("predicted classes: {}".format(predicted_classes))

Part III: Comparing SVM and QSVM accuracy and execution time on the Breast Cancer dataset

The table below compares the accuracy of the classical SVM program (Part I) with that of QSVM (Part II), each run three times.
Both accuracy and execution time are compared to decide which algorithm performs better.
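
Wall-clock times like these can be collected by wrapping each run in a timer. This is a minimal sketch of one way to do it, not necessarily how the linked notebooks do it: timed and placeholder_run are hypothetical names, and the placeholder stands in for SklearnSVM(...).run() or svm.run(quantum_instance).

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def placeholder_run():
    """Stand-in for the real SVM or QSVM run (hypothetical)."""
    return {"testing_accuracy": 0.0}  # dummy value, not a measurement


# Repeat the run three times, as in the comparison.
for attempt in range(3):
    result, elapsed = timed(placeholder_run)
    print(result["testing_accuracy"], "--- %s seconds ---" % elapsed)
```

Repeating the run several times, as done here, shows how much the timing varies from run to run.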

SVM (accuracy and execution time):
Run 1: accuracy 0.85, 0.28108978271484375 seconds
Run 2: accuracy 0.85, 0.27810001373291016 seconds
Run 3: accuracy 0.85, 0.27548813819885254 seconds
https://github.com/keamanansiber/qiskit/blob/master/3QuantumMachineLearning/SVM_Klasik.ipynb

QSVM (Quantum-enhanced SVM) (accuracy and execution time):
Run 1: accuracy 0.8, 9.273517370223999 seconds
Run 2: accuracy 0.8, 9.521188259124756 seconds
Run 3: accuracy 0.8, 9.461729764938354 seconds
https://github.com/keamanansiber/qiskit/blob/master/3QuantumMachineLearning/SVM_QuantumEnhanced.ipynb
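
To put the gap in one number, the measured times above can be averaged and divided. The snippet below is plain arithmetic on the six reported timings; nothing is re-measured.

```python
from statistics import mean

# Execution times in seconds, copied from the two lists above.
svm_times = [0.28108978271484375, 0.27810001373291016, 0.27548813819885254]
qsvm_times = [9.273517370223999, 9.521188259124756, 9.461729764938354]

svm_mean = mean(svm_times)
qsvm_mean = mean(qsvm_times)
slowdown = qsvm_mean / svm_mean

print("SVM mean:  %.3f s" % svm_mean)
print("QSVM mean: %.3f s" % qsvm_mean)
print("QSVM / SVM: %.1fx" % slowdown)
```

On these numbers the simulated QSVM averages about 9.4 seconds against about 0.28 seconds for the classical SVM, a factor of roughly 34.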

CONCLUSION:
The accuracies are close (0.85 for classical SVM versus 0.8 for QSVM), but QSVM takes far longer, roughly 9.3 to 9.5 seconds against about 0.28 seconds, because its quantum circuits are executed on a classical simulator (the BasicAer qasm_simulator used in the code above).
I believe that once real quantum devices become commercially available, QSVM running on quantum hardware will be able to beat classical SVM in execution time.

References:
https://github.com/Qiskit/qiskit-community-tutorials/blob/add1163d5caf7d69356b3e355820e9708be82e55/machine_learning/svm_classical.ipynb
https://github.com/Qiskit/qiskit-community-tutorials/blob/add1163d5caf7d69356b3e355820e9708be82e55/machine_learning/qsvm.ipynb
