Reinforcement Learning (6) Recall $V^\star, Q^\star, V^\pi, Q^\pi$. $V^\star(s) = \max_{a \in A}\big(\underbrace{R(s,a) + \gamma\, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}[V^\star(s')]}_{Q^\star(s,a)}\big)$ 2024-02-11 Course Notes > Reinforcement Learning
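The Bellman optimality backup in the entry above can be applied iteratively (value iteration). A minimal sketch on a made-up two-state MDP — the state names, actions, rewards, and discount below are all illustrative assumptions, not from the notes:

```python
# Value iteration: repeatedly apply the Bellman optimality backup
#   V(s) <- max_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) * V(s') ]
# until the largest per-state change falls below a tolerance.

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """P[s][a] is a list of (next_state, prob); R[s][a] is a scalar reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Toy deterministic MDP (hypothetical numbers for illustration).
states = ["s0", "s1"]
actions = ["stay", "go"]
P = {
    "s0": {"stay": [("s0", 1.0)], "go": [("s1", 1.0)]},
    "s1": {"stay": [("s1", 1.0)], "go": [("s0", 1.0)]},
}
R = {
    "s0": {"stay": 0.0, "go": 1.0},
    "s1": {"stay": 2.0, "go": 0.0},
}
V = value_iteration(states, actions, P, R, gamma=0.5)
```

With $\gamma = 0.5$ the fixed point is $V^\star(s_1) = 2 + 0.5\,V^\star(s_1) = 4$ and $V^\star(s_0) = 1 + 0.5 \cdot 4 = 3$.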
Reinforcement Learning (2-4) Markov Decision Processes. Infinite-horizon discounted MDPs. An MDP $M = (S, A, P, R, \gamma)$ consists of: state space $S$; action space $A$; transition function $P : S \times A \to \Delta(S)$… 2024-02-07 Course Notes > Reinforcement Learning
Deep Learning for CV (X) $\ell_1$ and $\ell_2$ regularization; softmax: $\mathrm{softmax}(f_1, \cdots, f_c) = \left( \frac{\exp(f_1)}{\sum_j \exp(f_j)}, \cdots, \frac{\exp(f_c)}{\sum_j \exp(f_j)} \right)$ 2024-02-06 Course Notes > Deep Learning for CV
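The softmax formula above translates directly to code; the only practical wrinkle is subtracting the max logit before exponentiating so large inputs don't overflow (a standard trick, valid because softmax is invariant to adding a constant to every logit). A minimal sketch:

```python
import math

def softmax(f):
    # Shift by the max for numerical stability; the output is unchanged.
    m = max(f)
    exps = [math.exp(fi - m) for fi in f]
    Z = sum(exps)
    return [e / Z for e in exps]

probs = softmax([1.0, 2.0, 3.0])
# components are positive, sum to 1, and preserve the ordering of the inputs
```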
Distributed Systems (6-7) Linearizability. Liveness: guarantee that something good will eventually happen. Safety: guarantee that something bad will never happen. Stable: once true, stays true forever afterwards. Multicast Basic… 2024-02-05 Course Notes > Distributed Systems
Reinforcement Learning (1) Example: Shortest Path. Nodes: states; edges: actions. Greedy is not optimal. Bellman equation (dynamic programming): $V^\star(d) = \min\{3 + V^\star(g),\ 2 + V^\star(f)\}$ 2024-02-04 Course Notes > Reinforcement Learning #CS443
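The Bellman recursion for shortest paths can be sketched on a small DAG. The nodes $d$, $f$, $g$ and the edge costs 3 and 2 echo the entry above, but the goal node and the remaining edge costs are made-up assumptions chosen so that greedy fails:

```python
import functools

# Hypothetical graph: "t" is the goal. Only the two edges out of "d"
# (cost 3 to g, cost 2 to f) come from the note; the rest is illustrative.
edges = {
    "d": [("g", 3), ("f", 2)],
    "g": [("t", 1)],
    "f": [("t", 4)],
    "t": [],  # goal: V*(t) = 0
}

@functools.lru_cache(maxsize=None)
def V(s):
    # Bellman equation: V*(s) = min over outgoing edges of (cost + V*(next)).
    if not edges[s]:
        return 0
    return min(cost + V(nxt) for nxt, cost in edges[s])
```

Here greedy from `d` takes the cheaper edge to `f` (cost 2) and pays 2 + 4 = 6 in total, while the Bellman solution goes through `g` for 3 + 1 = 4 — the "greedy is not optimal" point from the note.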
Reinforcement Learning Homework (0) 1. Among all probability distributions over $[a, b] \subset \mathbb{R}$, which distribution has the highest variance? How large is that variance? $P(x) = \begin{cases} \frac{1}{2} & x \in \{a, b\} \\ 0 & \text{otherwise} \end{cases}$ 2024-02-03 Course Notes > Reinforcement Learning
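The variance of this two-point distribution can be checked directly (a worked step, not taken from the notes):

```latex
\mathbb{E}[X] = \frac{a+b}{2}, \qquad
\mathbb{E}[X^2] = \frac{a^2+b^2}{2}, \qquad
\mathrm{Var}(X) = \frac{a^2+b^2}{2} - \left(\frac{a+b}{2}\right)^2
               = \frac{(b-a)^2}{4}.
```

Any distribution on $[a,b]$ has variance at most $\frac{(b-a)^2}{4}$, so the half-half mass on the endpoints attains the maximum.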
Deep Learning for Computer Vision (2) The basic supervised learning framework: $y = f(x)$. $y$: output; $f$: prediction function; $x$: input. Training set: $\{(x_1, y_1), \ldots, (x_N, y_N)\}$. Nearest neighbor… 2024-02-01 Course Notes > Deep Learning for CV #CS444
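The nearest-neighbor predictor mentioned above is the simplest instance of this framework: $f$ just returns the label of the closest training point. A minimal sketch with L2 distance (the toy points and labels are made up for illustration):

```python
def nearest_neighbor_predict(train, x):
    """train: list of (x_i, y_i) pairs with numeric-vector inputs.
    Returns the label of the training point closest to x in L2 distance."""
    def dist2(u, v):
        # Squared Euclidean distance (monotone in distance, so argmin is the same).
        return sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    _, y = min(train, key=lambda pair: dist2(pair[0], x))
    return y

# Hypothetical 2-D training set.
train = [((0.0, 0.0), "cat"), ((1.0, 1.0), "dog")]
```

Note there is no training step at all: "training" is just memorizing the set, and all the work happens at prediction time.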
Deep Learning for Computer Vision (1) CS 444: Deep Learning for Computer Vision Course Notes. A Taxonomy of Learning Problems. Learning problems are categorized by the type of output: classification, regression, structured prediction, dense prediction… 2024-02-01 Course Notes > Deep Learning for CV #CS444
Communication Networks (5) Internet transport protocol services. TCP service: reliable transport; flow control (sender won't overwhelm receiver); congestion control (throttle sender when the network is overloaded); does not provide timing… 2024-01-31 Course Notes > Communication Networks
Communication Networks (4) Application architectures. Possible structures of applications: client-server, peer-to-peer (P2P), hybrid. Client-server architecture: server is an always-on host with a permanent IP address; client may be intermittently connected… 2024-01-30 Course Notes > Communication Networks