Published by Leo Wei

Before reading the rest of this blog, make sure you have the Federated Learning (Data Scientist) notebook open side by side.

PySyft Duet (Data Owner)

As a data owner, you want someone else to perform data science on data that you own, but you also want to protect that data by not giving them all of it.
To do this, we load the data into our local Duet server.
To begin, launch a Duet session and help your Duet partner (the data scientist) connect to this server.

Duet Basics

Make sure that the network_url you use is chosen from https://raw.githubusercontent.com/OpenMined/OpenGridNodes/master/network_address

Step 1. Initiate Duet Connection

import syft as sy
duet = sy.launch_duet(network_url="http://ec2-18-218-7-180.us-east-2.compute.amazonaws.com:5000")
🎤  🎸  ♪♪♪ Starting Duet ♫♫♫  🎻  🎹

♫♫♫ > DISCLAIMER: Duet is an experimental feature currently in beta.
♫♫♫ > Use at your own risk.


    > ❤️ Love Duet? Please consider supporting our community!
    > https://github.com/sponsors/OpenMined

♫♫♫ > Punching through firewall to OpenGrid Network Node at:
♫♫♫ > http://ec2-18-218-7-180.us-east-2.compute.amazonaws.com:5000
♫♫♫ >
♫♫♫ > ...waiting for response from OpenGrid Network... 
♫♫♫ > DONE!
♫♫♫ > Duet Server ID: 00eec93acc58f144d78a365705d42223

♫♫♫ > STEP 1: Send the following code to your Duet Partner!

import syft as sy
duet = sy.duet("00eec93acc58f144d78a365705d42223")

♫♫♫ > STEP 2: Ask your partner for their Client ID and enter it below!
♫♫♫ > Duet Partner's Client ID: 3406364123dfde0c6e7394b2167a9ef9

♫♫♫ > Connecting...
/opt/anaconda3/envs/duet/lib/python3.9/site-packages/aiortc/rtcdtlstransport.py:211: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
  _openssl_assert(lib.SSL_CTX_use_certificate(ctx, self._cert._x509) == 1)  # type: ignore
/opt/anaconda3/envs/duet/lib/python3.9/site-packages/aiortc/rtcdtlstransport.py:186: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
  value=certificate_digest(self._cert._x509),  # type: ignore
♫♫♫ > CONNECTED!

♫♫♫ > DUET LIVE STATUS  *  Objects: 18  Requests: 0   Messages: 159124  Request Handlers: 1                                                         

Step 2. Go to Data Scientist Notebook

Once the connection between the data owner and the data scientist is established, let's upload some data to the Duet server.

Step 3. Create Data and Upload to Duet Server

import torch as th
grade_data = th.tensor([98, 78, 83, 88, 67, 73])
grade_data = grade_data.tag("grades")
grade_data = grade_data.describe("This is a list of the grades of 6 people")
# Note that the data is still on the owner's machine and cannot be viewed or accessed
# without permission from the data owner
grade_data_pointer = grade_data.send(duet, pointable=True)
[2022-06-19T16:37:24.257117-0600][CRITICAL][logger]][15016] You do not have permission to .get() Object with ID: <UID: 0e6d0efcb06441db9d958e2006f5cbc8>Please submit a request.
[2022-06-19T16:37:24.258016-0600][CRITICAL][logger]][15016] You do not have permission to .get() Object with ID: <UID: 0e6d0efcb06441db9d958e2006f5cbc8>Please submit a request.

Step 4. Go to Data Scientist Notebook

Step 5. Check for Requests from Data Scientist

duet.requests.pandas
  | Requested Object's tags | Reason            | Request ID                              | Requested Object's ID                   | Requested Object's type
0 | [grades, float, mean]   | please, I need it | <UID: 16cb1d6c39024a87acf1a29dd1f3d9d7> | <UID: 0e6d0efcb06441db9d958e2006f5cbc8> |
duet.requests[0].deny()
[2022-06-19T16:37:35.873523-0600][CRITICAL][logger]][15016] You do not have permission to .get() Object with ID: <UID: 0e6d0efcb06441db9d958e2006f5cbc8>Please submit a request.
[2022-06-19T16:37:35.874204-0600][CRITICAL][logger]][15016] You do not have permission to .get() Object with ID: <UID: 0e6d0efcb06441db9d958e2006f5cbc8>Please submit a request.
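
To make the request flow concrete, here is a minimal pure-Python sketch of the idea behind it. It is not PySyft's implementation: the class `ToyStore` and all of its methods are hypothetical names made up for illustration. The owner keeps the data, the data scientist only holds an opaque ID, and fetching the data fails until the owner accepts a request.

```python
import uuid


class ToyStore:
    """Hypothetical stand-in for the owner's side of a Duet session."""

    def __init__(self):
        self._objects = {}      # object ID -> data; stays with the owner
        self._approved = set()  # object IDs the owner has released
        self.requests = []      # pending (object ID, reason) pairs

    def send(self, data):
        """Store data locally and hand back only an opaque ID (the 'pointer')."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        return object_id

    def request(self, object_id, reason):
        """The data scientist asks for access and explains why."""
        self.requests.append((object_id, reason))

    def accept(self, index=0):
        """The owner approves a pending request."""
        object_id, _ = self.requests.pop(index)
        self._approved.add(object_id)

    def deny(self, index=0):
        """The owner rejects a pending request; nothing is released."""
        self.requests.pop(index)

    def get(self, object_id):
        """Return the data only if the owner has approved it."""
        if object_id not in self._approved:
            raise PermissionError("Please submit a request.")
        return self._objects[object_id]


store = ToyStore()
pointer = store.send([98, 78, 83, 88, 67, 73])

# First attempt: a vague reason, so the owner denies it.
store.request(pointer, "please, I need it")
store.deny()
try:
    store.get(pointer)
    denied_result = "got data"
except PermissionError:
    denied_result = "blocked"

# Second attempt: a clear reason, so the owner accepts it.
store.request(pointer, "I need the average grade for my analysis")
store.accept()
accepted_result = store.get(pointer)
```

In this toy version a denied request simply disappears from the queue, so the data scientist has to submit a new one with a better reason, which mirrors the back-and-forth between the two notebooks.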

Step 6. Go to Data Scientist Notebook

Step 7. Request Handling

duet.requests.pandas
  | Requested Object's tags | Reason                                             | Request ID                              | Requested Object's ID                   | Requested Object's type
0 | [grades, float, mean]   | I am a data scientist and I need to know the a... | <UID: 1faa4d8b428748f3840271919a750a5b> | <UID: 0e6d0efcb06441db9d958e2006f5cbc8> |
duet.requests[0].request_description
"I am a data scientist and I need to know the average of the students' grades for my analysis"
duet.requests[0].accept()
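
The tags on the request (grades, float, mean) suggest that the data scientist asked for the mean of the grades after casting them to float. Assuming that is the case, the value released by accepting the request should be the ordinary arithmetic mean, which we can check by hand with plain Python instead of torch:

```python
from statistics import fmean

grades = [98, 78, 83, 88, 67, 73]  # same values as grade_data above

total = sum(grades)      # 487
average = fmean(grades)  # 487 / 6
print(f"sum={total}, mean={average:.2f}")  # prints sum=487, mean=81.17
```

So the data scientist learns a single aggregate number, while the six individual grades stay on the owner's machine.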

Step 8. Go to Data Scientist Notebook

Step 9. Get MNIST Data and Add Request Handlers

MNIST with Duet

Part 1: Launch a Duet Server and Connect (Done above)

Part 2: Get data

from syft.util import get_root_data_path
import torchvision

# Shared preprocessing: convert to a tensor, then normalize with the MNIST mean and std
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize((0.1307,), (0.3081,)),
])
# Download the training and test splits into syft's data directory
torchvision.datasets.MNIST(get_root_data_path(), train=True, download=True, transform=transform)
torchvision.datasets.MNIST(get_root_data_path(), train=False, download=True, transform=transform)
Dataset MNIST
    Number of datapoints: 10000
    Root location: /Users/leowei/.syft/data
    Split: Test
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(0.1307,), std=(0.3081,))
           )
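
To see what this transform does to a single pixel, here is a small plain-Python sketch. It only mimics the arithmetic: `ToTensor` scales 8-bit pixel values from 0..255 to 0..1, and `Normalize` then subtracts the mean and divides by the standard deviation. The helper names `to_tensor_scale` and `normalize` are made up for this illustration and are not torchvision functions.

```python
MNIST_MEAN = 0.1307  # values passed to Normalize in the code above
MNIST_STD = 0.3081


def to_tensor_scale(pixel):
    """Mimic ToTensor's scaling of an 8-bit pixel (0..255) to the range 0..1."""
    return pixel / 255


def normalize(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Mimic Normalize for one channel: subtract the mean, divide by the std."""
    return (value - mean) / std


black = normalize(to_tensor_scale(0))    # background pixel
white = normalize(to_tensor_scale(255))  # brightest stroke pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

Background pixels end up slightly below zero and stroke pixels well above it, which centers the inputs roughly around zero before training.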

Part 3: Add Request Handlers

duet.requests.pandas
duet.store.pandas
# duet.requests.add_handler(action = "deny")
  | ID                                      | Tags     | Description                              | object_type
0 | <UID: 2550b41ead684a24ac87a5dced4c5c6d> | [grades] | This is a list of the grades of 6 people | <class 'torch.Tensor'>
duet.requests.add_handler(action="accept")
/opt/anaconda3/envs/duet/lib/python3.9/site-packages/syft/lib/torch/uppercase_tensor.py:30: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
  grad = getattr(obj, "grad", None)
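
With an accept handler in place, the owner no longer has to call .accept() on every request by hand. The sketch below illustrates the idea in plain Python. It is an assumption-laden toy, not PySyft's internals: `ToyRequestQueue`, `submit`, and the request names are all hypothetical.

```python
class ToyRequestQueue:
    """Hypothetical request queue with optional automatic handlers."""

    def __init__(self):
        self.handlers = []  # actions applied to every new request
        self.pending = []   # requests waiting for a manual decision
        self.accepted = []
        self.denied = []

    def add_handler(self, action):
        """Register a blanket action for all future requests."""
        if action not in ("accept", "deny"):
            raise ValueError(f"unknown action: {action}")
        self.handlers.append(action)

    def submit(self, name):
        """Route a new request: the first handler decides, otherwise it waits."""
        if not self.handlers:
            self.pending.append(name)
        elif self.handlers[0] == "accept":
            self.accepted.append(name)
        else:
            self.denied.append(name)


queue = ToyRequestQueue()
queue.submit("grades mean")         # no handler yet: the owner decides by hand
queue.add_handler(action="accept")
queue.submit("model loss")          # approved automatically
queue.submit("test accuracy")       # approved automatically
```

A blanket accept releases anything that is requested, so it suits a tutorial like this one better than a session with sensitive data.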

We have now done everything on the data owner's side. The rest continues in the data scientist's notebook.
Step 10. Go to Data Scientist Notebook