Reinforcement Learning (6) Recall $V^\star, Q^\star, V^\pi, Q^\pi$. $V^\star(s) = \max_{a \in A}\big(\underbrace{R(s,a) + \gamma\, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}[V^\star(s')]}_{Q^\star(s,a)}\big)$ 2024-02-11 Course Notes > Reinforcement Learning
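The Bellman optimality backup in the entry above can be applied iteratively (value iteration). A minimal sketch on a made-up two-state MDP — the state names, actions, rewards, and discount below are all illustrative assumptions, not from the notes:

```python
# Value iteration: repeatedly apply the Bellman optimality backup
#   V(s) <- max_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) * V(s') ]
# until the largest per-state change falls below a tolerance.

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """P[s][a] is a list of (next_state, prob); R[s][a] is a scalar reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Toy deterministic MDP (hypothetical numbers for illustration).
states = ["s0", "s1"]
actions = ["stay", "go"]
P = {
    "s0": {"stay": [("s0", 1.0)], "go": [("s1", 1.0)]},
    "s1": {"stay": [("s1", 1.0)], "go": [("s0", 1.0)]},
}
R = {
    "s0": {"stay": 0.0, "go": 1.0},
    "s1": {"stay": 2.0, "go": 0.0},
}
V = value_iteration(states, actions, P, R, gamma=0.5)
```

With $\gamma = 0.5$ the fixed point is $V^\star(s_1) = 2 + 0.5\,V^\star(s_1) = 4$ and $V^\star(s_0) = 1 + 0.5 \cdot 4 = 3$.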
Reinforcement Learning (2-4) Markov Decision Processes. Infinite-horizon discounted MDPs. An MDP $M = (S, A, P, R, \gamma)$ consists of: state space $S$; action space $A$; transition function $P : S \times A \to \Delta(S)$… 2024-02-07 Course Notes > Reinforcement Learning
Deep Learning for CV (X) $\ell_1$ and $\ell_2$ regularization; softmax: $\mathrm{softmax}(f_1, \cdots, f_c) = \left( \frac{\exp(f_1)}{\sum_j \exp(f_j)}, \cdots, \frac{\exp(f_c)}{\sum_j \exp(f_j)} \right)$ 2024-02-06 Course Notes > Deep Learning for CV
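The softmax formula above translates directly to code; the only practical wrinkle is subtracting the max logit before exponentiating so large inputs don't overflow (a standard trick, valid because softmax is invariant to adding a constant to every logit). A minimal sketch:

```python
import math

def softmax(f):
    # Shift by the max for numerical stability; the output is unchanged.
    m = max(f)
    exps = [math.exp(fi - m) for fi in f]
    Z = sum(exps)
    return [e / Z for e in exps]

probs = softmax([1.0, 2.0, 3.0])
# components are positive, sum to 1, and preserve the ordering of the inputs
```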
Distributed Systems (6-7) Linearizability. Liveness: guarantee that something good will eventually happen. Safety: guarantee that something bad will never happen. Stable: once true, stays true forever afterwards. Multicast Basic… 2024-02-05 Course Notes > Distributed Systems
Reinforcement Learning (1) Example: Shortest Path. Nodes: states; edges: actions. Greedy is not optimal. Bellman equation (dynamic programming): $V^\star(d) = \min\{3 + V^\star(g),\ 2 + V^\star(f)\}$ 2024-02-04 Course Notes > Reinforcement Learning #CS443
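The Bellman recursion for shortest paths can be sketched on a small DAG. The nodes $d$, $f$, $g$ and the edge costs 3 and 2 echo the entry above, but the goal node and the remaining edge costs are made-up assumptions chosen so that greedy fails:

```python
import functools

# Hypothetical graph: "t" is the goal. Only the two edges out of "d"
# (cost 3 to g, cost 2 to f) come from the note; the rest is illustrative.
edges = {
    "d": [("g", 3), ("f", 2)],
    "g": [("t", 1)],
    "f": [("t", 4)],
    "t": [],  # goal: V*(t) = 0
}

@functools.lru_cache(maxsize=None)
def V(s):
    # Bellman equation: V*(s) = min over outgoing edges of (cost + V*(next)).
    if not edges[s]:
        return 0
    return min(cost + V(nxt) for nxt, cost in edges[s])
```

Here greedy from `d` takes the cheaper edge to `f` (cost 2) and pays 2 + 4 = 6 in total, while the Bellman solution goes through `g` for 3 + 1 = 4 — the "greedy is not optimal" point from the note.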
Reinforcement Learning Homework (0) 1. Among all probability distributions over $[a, b] \subset \mathbb{R}$, which distribution has the highest variance? How large is that variance? $P(x) = \begin{cases} \frac{1}{2} & x \in \{a, b\} \\ 0 & \text{otherwise} \end{cases}$ 2024-02-03 Course Notes > Reinforcement Learning
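The variance of this two-point distribution can be checked directly (a worked step, not taken from the notes):

```latex
\mathbb{E}[X] = \frac{a+b}{2}, \qquad
\mathbb{E}[X^2] = \frac{a^2+b^2}{2}, \qquad
\mathrm{Var}(X) = \frac{a^2+b^2}{2} - \left(\frac{a+b}{2}\right)^2
               = \frac{(b-a)^2}{4}.
```

Any distribution on $[a,b]$ has variance at most $\frac{(b-a)^2}{4}$, so the half-half mass on the endpoints attains the maximum.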
Deep Learning for Computer Vision (2) The basic supervised learning framework: $y = f(x)$. $y$: output; $f$: prediction function; $x$: input. Training set: $\{(x_1, y_1), \ldots, (x_N, y_N)\}$. Nearest neighbor… 2024-02-01 Course Notes > Deep Learning for CV #CS444
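The nearest-neighbor predictor mentioned above is the simplest instance of this framework: $f$ just returns the label of the closest training point. A minimal sketch with L2 distance (the toy points and labels are made up for illustration):

```python
def nearest_neighbor_predict(train, x):
    """train: list of (x_i, y_i) pairs with numeric-vector inputs.
    Returns the label of the training point closest to x in L2 distance."""
    def dist2(u, v):
        # Squared Euclidean distance (monotone in distance, so argmin is the same).
        return sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    _, y = min(train, key=lambda pair: dist2(pair[0], x))
    return y

# Hypothetical 2-D training set.
train = [((0.0, 0.0), "cat"), ((1.0, 1.0), "dog")]
```

Note there is no training step at all: "training" is just memorizing the set, and all the work happens at prediction time.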
Deep Learning for Computer Vision (1) CS 444: Deep Learning for Computer Vision Course Notes. A Taxonomy of Learning Problems. Learning problems are categorized by the type of output: classification, regression, structured prediction, dense prediction… 2024-02-01 Course Notes > Deep Learning for CV #CS444
Communication Networks (5) Internet transport protocol services. TCP service: reliable transport; flow control (sender won't overwhelm receiver); congestion control (throttle sender when the network is overloaded); does not provide timing… 2024-01-31 Course Notes > Communication Networks
Communication Networks (4) Application architectures. Possible structures of applications: client-server, peer-to-peer (P2P), hybrid. Client-server architecture: server is an always-on host with a permanent IP address; client may be intermittently connected… 2024-01-30 Course Notes > Communication Networks