PhDOpen: Aleksander Mądry
Szymon Toruńczyk
10:09
@Wojciech typically there is a final exam (remote, written) or a final programming assignment
Wojciech Jabłoński
11:11
I understand that details about the exam/final assignment will appear later
Szymon Toruńczyk
11:29
yes
Grzegorz Bokota
18:52
there are already 50 people
Piotr Wygocki
19:18
There will be a practical assignment. We will present it in the exercise session, probably tomorrow.
Karolina Drabent
20:54
what does "w.p." mean?
mkurtys
21:01
With probability
Karolina Drabent
21:06
thanks
Wojciech Jabłoński
28:58
how do you use unlabeled data for augmentation?
Wojciech Jabłoński
36:59
it sounds like robust training works like a form of regularization for a neural network
Grzegorz Fabiański
37:35
How does this explain the graph? (and the change between small / large training sets)
mkurtys
40:43
Does a similar graph (crossing) occur with strong regularisation?
Tomasz Grzegorzek
55:45
how does robust training change the explainability of the model (e.g. LIME / others)?
Krzysztof Galias
59:09
how good are the non-robust features in the context of generalization?
Konstanty Subbotko
01:01:11
does model architecture impact choosing between robust/non-robust features?
Grzegorz Fabiański
01:01:13
How do generative networks overcome this ML emphasis on non-robust features? We have networks generating human-feature faces, not ML (non-robust) feature faces.
Wojciech Jabłoński
01:02:13
human intuition uses "non-robust" features
Michal
01:05:49
thank you!
Marek Cygan
01:36:35
yes
Mateusz Przyborowski
01:36:35
yes
Tomasz Pawlowski
01:36:36
yes
Maciej
01:48:38
it's sign(yw) = y sign(w)
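For labels y in {-1, +1}, the identity sign(yw) = y·sign(w) can be checked numerically; a quick sketch (illustrative, not from the notebook):

```python
import numpy as np

# For y in {-1, +1}, multiplying w by y either preserves or flips every
# coordinate's sign, so sign(y * w) == y * sign(w) holds coordinate-wise.
w = np.array([-2.0, 0.5, 3.0])
for y in (-1.0, 1.0):
    assert np.array_equal(np.sign(y * w), y * np.sign(w))
print("identity holds")
```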
Andrzej Pacuk
01:55:51
On the cell `sgrad = smooth_grad(std_model, img_IN, targ_IN, normalization_function_IN, 100, 0.3)` I get an error: AttributeError: 'ellipsis' object has no attribute 'clone'
MaciejSatkiewicz
01:56:15
You have to fill in the line with the ellipsis :)
MaciejSatkiewicz
01:56:34
Both lines
Piotr Wygocki
01:56:36
Your task is to fill in some lines in the smooth_grad function
Andrzej Pacuk
01:57:19
got it, the error just confused me.
Przemek
01:57:46
Yeah, it is quite weird that `...` is valid Python code
Piotr Wygocki
01:58:24
Hint 1: https://pytorch.org/docs/stable/generated/torch.normal.html might be useful
MaciejSatkiewicz
01:59:28
And just in case, the std_model we use here is defined as:
MaciejSatkiewicz
01:59:30
std_model = torchvision.models.resnet18(pretrained=True).cuda()
std_model.eval()
pass
MaciejSatkiewicz
02:00:17
It’s a standard pre-trained ImageNet model and we’d like to see if we can make any sense of its gradients in the input space
Julia Bazinska
02:00:59
What do the colors in the visualization actually mean?
Piotr Wygocki
02:01:19
In this visualisation they mean nothing
Piotr Wygocki
02:01:34
That's why there are different types of visualisations
MaciejSatkiewicz
02:02:14
FYI, here’s the code of the visualisation function:
Piotr Wygocki
02:02:18
but for a "human" way of interpreting pictures you should maybe see some shapes
MaciejSatkiewicz
02:02:21
mt = torch.mean(t, dim=[2, 3], keepdim=True).expand_as(t)
st = torch.std(t, dim=[2, 3], keepdim=True).expand_as(t)
return torch.clamp((t - mt) / (3 * st) + 0.5, 0, 1)
MaciejSatkiewicz
02:02:59
We just take the gradient in the input space and we normalise and shift it to fit the [0,1] interval
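This normalise-and-shift step can be checked standalone; a minimal sketch, assuming NCHW tensors and a stand-in random "gradient" (the function body follows the pasted visualisation code):

```python
import torch

def normalize_for_display(t):
    # Per-channel mean/std over the spatial dims; roughly 3 standard
    # deviations around the mean land in [0, 1], the rest is clamped.
    mt = torch.mean(t, dim=[2, 3], keepdim=True).expand_as(t)
    st = torch.std(t, dim=[2, 3], keepdim=True).expand_as(t)
    return torch.clamp((t - mt) / (3 * st) + 0.5, 0, 1)

grad = torch.randn(1, 3, 8, 8)  # stand-in for an input-space gradient
out = normalize_for_display(grad)
print(out.min().item() >= 0.0 and out.max().item() <= 1.0)  # True
```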
Julia Bazinska
02:03:19
ok, so the [0,1] interval is mapped to the colors somehow?
Piotr Wygocki
02:03:34
Our pictures are also normalized
Piotr Wygocki
02:03:48
so it's consistent with the functions for printing pictures
Julia Bazinska
02:04:08
Ok
mkurtys
02:04:15
You can see utils in Colab: just go to the far left, choose Files, and open utils.py by double-clicking
Julia Bazinska
02:04:43
Thanks
Piotr Wygocki
02:05:12
actually, normalized here means that they are in [0,1]
mkurtys
02:11:03
The robust model was trained on such examples, wasn’t it? :-)
MaciejSatkiewicz
02:14:40
Yes, the robust model was trained on adversarial examples, by the definition of robust training.
Przemek
02:21:51
How much slower is robust training compared to the standard one?
MaciejSatkiewicz
02:22:27
It’s considerably slower as you need to calculate adversarial examples for every input
MaciejSatkiewicz
02:23:20
And actually that’s the only difference
Przemek
02:24:47
yeah, so it requires running this PGD on each example from the dataset, which requires a model that is on the GPU. I would imagine that it could be tricky to implement
MaciejSatkiewicz
02:25:52
You’ll actually implement it on the exercises and in the homework task :)
Przemek
02:26:05
Oh, cool :)
MaciejSatkiewicz
02:26:07
L2PGD function is the trickiest part
MaciejSatkiewicz
02:26:11
But it’s already done
MaciejSatkiewicz
02:28:17
I’m not sure how much you can optimise the adversarial example calculation, but obviously the batch size plays a big part: the bigger the batches, the faster the robust training goes. But you also need to be cautious, as L2PGD uses a lot of CUDA memory
MaciejSatkiewicz
02:28:45
So the batches cannot be too big
Przemek
02:30:21
Yeah, I was also thinking about the problems with using multiple processes when your data loading requires the GPU; I encountered some problems with that before
Przemek
02:31:34
But I guess this is not the most important right now
MaciejSatkiewicz
02:32:38
Ok just a sec
MaciejSatkiewicz
02:33:30
The first one:
MaciejSatkiewicz
02:33:32
noise = torch.tensor(np.random.normal(0, stdev, im.shape), dtype=im.dtype)
noised_im = im + noise  # TODO
MaciejSatkiewicz
02:33:54
The second one:
MaciejSatkiewicz
02:33:55
loss = torch.mean(relevant_coordinate)
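Putting the two fill-ins together, a self-contained SmoothGrad sketch might look like this. The function and parameter names follow the notebook's `smooth_grad(model, img, target, normalization, n_samples, stdev)` call, but the exact interface there may differ, and the choice of "relevant coordinate" (the target-class logit) is an assumption:

```python
import torch

def smooth_grad(model, img, target, normalization, n_samples, stdev):
    # SmoothGrad: average the input-space gradients over several noisy
    # copies of the image. `normalization` is the model's preprocessing.
    total = torch.zeros_like(img)
    for _ in range(n_samples):
        # First fill-in: add Gaussian noise to the input image
        noise = torch.normal(0.0, stdev, size=img.shape, device=img.device)
        noised_im = (img + noise).clone().detach().requires_grad_(True)
        out = model(normalization(noised_im))
        # Second fill-in: the "relevant coordinate" is the target logit
        loss = torch.mean(out[torch.arange(out.shape[0]), target])
        loss.backward()
        total += noised_im.grad
    return total / n_samples
```

With a toy model, `smooth_grad(model, img, target, lambda x: x, 100, 0.3)` returns a tensor with the same shape as `img`.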
mkurtys
02:33:58
Why isn't the loss 1/relevant_coordinate?
MaciejSatkiewicz
02:34:26
@Przemek torch.nn.DataParallel should help
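For reference, wrapping a model in `torch.nn.DataParallel` is a one-liner; a minimal sketch with a toy model (the course's ResNet would be wrapped the same way):

```python
import torch
import torch.nn as nn

# nn.DataParallel splits each input batch along dim 0 across the visible
# GPUs, runs the replicas in parallel, and gathers the outputs; we only
# wrap when more than one GPU is available.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# Usage is unchanged: the batch is sharded across devices automatically.
out = model(torch.randn(3, 8, device=device))
```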
MaciejSatkiewicz
02:36:01
@mkurtys actually that is a little confusing: we implement a loss, but later we perform an untargeted attack, i.e. we maximise the loss
mkurtys
02:36:40
Thank you
MaciejSatkiewicz
02:36:54
Ideally this should not be called a loss, but that’s just consistent with the L2PGD interface, which takes the “custom_loss” parameter
MaciejSatkiewicz
02:37:10
And it’s ok in the grand scheme of things
MaciejSatkiewicz
02:37:21
(The naming convention)
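The sign convention can be seen in a toy one-step update: an untargeted attack ascends the gradient of the "loss" to maximise it, a targeted attack descends it (names here are illustrative, not the notebook's L2PGD):

```python
import torch

def attack_step(x, grad, step_size, targeted):
    # Untargeted: maximise the loss, so step along its gradient (ascent).
    # Targeted: minimise the loss towards the target class (descent).
    direction = -grad if targeted else grad
    return x + step_size * direction

x = torch.zeros(3)
g = torch.tensor([1.0, -2.0, 0.5])
untargeted = attack_step(x, g, 0.1, targeted=False)  # moves along +g
targeted = attack_step(x, g, 0.1, targeted=True)     # moves along -g
```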
Przemek
02:38:47
Thanks for the link, I will have to look into this nn.DataParallel. I've encountered this before but haven't figured out how to use it yet :)
Piotr Wygocki
02:49:40
Small hint: check out the previous train loop and just add an adversarial perturbation of the data
Piotr Wygocki
02:51:10
Also remember the "targeted" parameter of L2PGD
Piotr Wygocki
02:53:30
Setting use_tqdm=False in L2PGD might strongly reduce the output
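Putting the hints above together, here is a self-contained sketch of one robust-training step. The `l2_pgd` below is a simplified stand-in for the notebook's L2PGD (no normalization argument, different interface), and `adv_train_step` is a hypothetical name, not the course code:

```python
import torch
import torch.nn.functional as F

def l2_pgd(model, x, y, step_size=0.5, n_steps=20, eps=1.25):
    # Minimal untargeted L2 PGD: gradient-*ascent* steps on the loss,
    # each followed by projection onto the L2 ball of radius eps.
    model.eval()  # attack crafting runs in eval mode
    x_adv = x.clone().detach()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
        shape = (-1,) + (1,) * (x.dim() - 1)
        x_adv = x_adv.detach() + step_size * grad / g_norm.view(shape)
        delta = x_adv - x
        d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12)
        factor = (eps / d_norm).clamp(max=1.0).view(shape)
        x_adv = x + delta * factor  # project back into the eps-ball
    return x_adv.detach()

def adv_train_step(model, optimizer, data, target):
    # One robust-training step: replace the batch with its adversarial
    # version, then do an ordinary update on it.
    data = l2_pgd(model, data, target)
    model.train()  # l2_pgd left the model in eval mode
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()
    return loss.item()
```

This is also where the cost discussed earlier comes from: every batch pays for `n_steps` extra forward/backward passes before the actual update.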
MaciejSatkiewicz
02:55:01
Also notice that L2PGD sets the model to .eval() mode for the sake of producing the adversarial examples. It’s redundant for the linear network but will be important in the homework
MaciejSatkiewicz
02:56:01
And it’s important generally ;) your network may have a dropout layer or something
MaciejSatkiewicz
02:57:11
data = L2PGD(model, data, target, normalization, step_size=0.5, Nsteps=20, eps=1.25, targeted=False, use_tqdm=False)
mkurtys
02:57:28
So ideally one would move model.train() down below the L2PGD call
MaciejSatkiewicz
02:59:06
Yeah, it’s a little cleaner
MaciejSatkiewicz
02:59:24
def adv_eval_loop(model, loader, epoch="-", normalization=None):
    acc_meter = AverageMeter()
    iterator = tqdm(iter(loader), total=len(loader))
    model.eval()
    # TODO - fill the rest of code
    for data, target in iterator:
        data, target = data.cuda(), target.cuda()
        data = L2PGD(model, data, target, normalization,
                     step_size=0.5, Nsteps=20,
                     eps=1.25, targeted=False, use_tqdm=False)
        ## END
        val = utils.accuracy(model, data, target, normalization)
        acc_meter.update(val, data.shape[0])
        iterator.set_description(f"Epoch: {epoch}, ADV_TEST accuracy={acc_meter.avg:.2f}")
        iterator.refresh()
Przemek
03:03:34
Thanks