#neurips2023


It seems a fair AI would treat you the same if you had a different race/gender/disability/etc., but how can we ever test counterfactual fairness? In #NeurIPS2023 work with Victor Veitch, we show that you sometimes can, using simple, observed metrics like group parity! 🧵 arxiv.org/abs/2310.19691

arXiv.org · Causal Context Connects Counterfactual Fairness to Robust Prediction and Group Fairness

Counterfactual fairness requires that a person would have been classified in the same way by an AI or other algorithmic system if they had a different protected class, such as a different race or gender. This is an intuitive standard, as reflected in the U.S. legal system, but its use is limited because counterfactuals cannot be directly observed in real-world data. On the other hand, group fairness metrics (e.g., demographic parity or equalized odds) are less intuitive but more readily observed. In this paper, we use causal context to bridge the gaps between counterfactual fairness, robust prediction, and group fairness. First, we motivate counterfactual fairness by showing that there is not necessarily a fundamental trade-off between fairness and accuracy because, under plausible conditions, the counterfactually fair predictor is in fact accuracy-optimal in an unbiased target distribution. Second, we develop a correspondence between the causal graph of the data-generating process and which, if any, group fairness metrics are equivalent to counterfactual fairness. Third, we show that in three common fairness contexts – measurement error, selection on label, and selection on predictors – counterfactual fairness is equivalent to demographic parity, equalized odds, and calibration, respectively. Counterfactual fairness can sometimes be tested by measuring relatively simple group fairness metrics.
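As a concrete (and hedged) illustration of what "simple, observed metrics" means here: the sketch below computes the demographic parity and equalized odds gaps for a binary classifier with a binary protected attribute. This is my own toy example, not code from the paper; the function names and the random data are assumptions.

```python
# Minimal sketch (not the paper's code): the group fairness metrics the paper
# connects to counterfactual fairness, assuming binary labels, binary
# predictions, and a binary protected attribute.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)|; a small gap ~ demographic parity."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap in TPR or FPR across groups; a small gap ~ equalized odds."""
    gaps = []
    for y in (0, 1):  # condition on the true label (y=0 -> FPR, y=1 -> TPR)
        mask = y_true == y
        rate0 = y_pred[mask & (group == 0)].mean()
        rate1 = y_pred[mask & (group == 1)].mean()
        gaps.append(abs(rate0 - rate1))
    return max(gaps)

# Hypothetical toy data standing in for a real classifier's outputs.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
print(demographic_parity_gap(y_pred, group))
print(equalized_odds_gap(y_true, y_pred, group))
```

A gap near zero means the corresponding group fairness criterion approximately holds; the paper's contribution is identifying the causal contexts in which such an observed metric is equivalent to counterfactual fairness.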

I was an invited speaker at the NeurIPS conference in New Orleans in December 2023, for the NeuroAI social.

I was more than surprised to be invited to what is now primarily an AI/ML conference (despite "Neural" being the first word in its name, and despite the conference's origins in computational neuroscience). To say that the successful AI systems currently deployed and the neuroscience/study of biological intelligence have diverged would be an understatement, so it was a somewhat odd choice for the organizers to invite a neurophysiologist like me.

So, I took the invite as an opportunity to talk about attention in biological vision, and how whatever is now called attention in AI/ML/CNNs/transformers is almost orthogonal to what many others and I study within visual neuroscience, psychology, and cognitive science.

While the talk was a partial critique of current AI models, it was more a call for the field to take seriously the one known instance of intelligence (i.e., the biological world), which still has much to offer toward designing better AI systems.

If attention is not one of the cognitive ingredients that make up the intelligence recipe for autonomous systems, I don't know what is.

The talk slides can be found here: dropbox.com/scl/fi/927f50bfvqp

Dropbox · NeuroAI_Neurips_KS2023.pdf (shared with Dropbox)

Ok, a confession about attending #NeurIPS2023. I was there for the cutting-edge AI/ML innovation and science, sure. But I was *also* there for the food, and to see old friends. But *also*, really, maybe the first thing I thought of?

Jazz.

Saw a fantastic concert at Preservation Hall, lots of great music on Frenchmen St. And I went to the Jazz Museum, really nice.

What I wasn't expecting? Jazz museum is in the old Mint. Which had this very cool old calculator.

Consistency models are like diffusion models, but based on ordinary rather than stochastic differential equations. This avoids the many sampling steps and maps input to output more deterministically. Great talk by Yang Song at #NeurIPS2023. arxiv.org/abs/2303.01469

arXiv.org · Consistency Models

Diffusion models have significantly advanced the fields of image, audio, and video generation, but they depend on an iterative sampling process that causes slow generation. To overcome this limitation, we propose consistency models, a new family of models that generate high quality samples by directly mapping noise to data. They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality. They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either by distilling pre-trained diffusion models, or as standalone generative models altogether. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained in isolation, consistency models become a new family of generative models that can outperform existing one-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64 and LSUN 256x256.
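To make "one-step by design, multistep optionally" concrete, here is a minimal sketch assuming a trained consistency function f(x, sigma) that maps a noisy sample at noise level sigma directly to a clean-sample estimate. The noise levels, helper names, and re-noising schedule are placeholder assumptions, not the paper's actual sampler settings.

```python
# Minimal sketch (assumptions: `f(x, sigma)` is a trained consistency model
# returning a clean-sample estimate; noise levels are illustrative only).
import torch

def one_step_sample(f, shape, sigma_max=80.0):
    """One network evaluation: map pure noise directly to data."""
    x = torch.randn(shape) * sigma_max
    return f(x, sigma_max)

def multistep_sample(f, shape, sigmas=(80.0, 24.0, 5.0, 0.5)):
    """Trade extra compute for quality: predict clean data, partially
    re-noise to a smaller sigma, and predict again."""
    x = torch.randn(shape) * sigmas[0]
    x0 = f(x, sigmas[0])
    for sigma in sigmas[1:]:
        x = x0 + torch.randn(shape) * sigma  # re-noise the current estimate
        x0 = f(x, sigma)
    return x0
```

The point of the one-step path is that f targets the endpoint of the probability-flow ODE directly, so a single network evaluation suffices; additional evaluations are an optional quality/compute trade-off rather than a requirement, unlike the long iterative sampling of standard diffusion models.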

Models can't decide what to do with the update?
Let them vote.
Delete small (noise) parameters.
Delete weights with sign disagreements.
Sounds interesting?
What if we call it model merging/fusion?
arxiv.org/abs/2306.01708

Come talk at the poster.

Poster #1118, enjoy the walk...

arXiv.org · TIES-Merging: Resolving Interference When Merging Models

Transfer learning - i.e., further fine-tuning a pre-trained model on a downstream task - can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency. These advantages have led to a proliferation of task-specific fine-tuned models, which typically can only perform a single task and do not benefit from one another. Recently, model merging techniques have emerged as a solution to combine multiple task-specific models into a single multitask model without performing additional training. However, existing merging methods often ignore the interference between parameters of different models, resulting in large performance drops when merging multiple models. In this paper, we demonstrate that prior merging techniques inadvertently lose valuable information due to two major sources of interference: (a) interference due to redundant parameter values and (b) disagreement on the sign of a given parameter's values across models. To address this, we propose our method, TRIM, ELECT SIGN & MERGE (TIES-Merging), which introduces three novel steps when merging models: (1) resetting parameters that only changed a small amount during fine-tuning, (2) resolving sign conflicts, and (3) merging only the parameters that are in alignment with the final agreed-upon sign. We find that TIES-Merging outperforms several existing methods in diverse settings covering a range of modalities, domains, number of tasks, model sizes, architectures, and fine-tuning settings. We further analyze the impact of different types of interference on model parameters, and highlight the importance of resolving sign interference. Our code is available at https://github.com/prateeky2806/ties-merging
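Since the abstract spells out the three steps (trim, elect sign, disjoint merge), here is a minimal sketch of them on flat parameter tensors. This is my own illustration under assumed defaults (the keep fraction and scaling factor lam are placeholders), not the authors' released implementation, which is at the GitHub link above.

```python
# Minimal sketch of the trim / elect-sign / disjoint-merge steps described
# in the abstract, on flat parameter tensors. Not the authors' code.
import torch

def ties_merge(base, finetuned, keep_frac=0.2, lam=1.0):
    # Task vectors: how each fine-tuned model moved away from the base.
    task_vecs = [ft - base for ft in finetuned]

    # (1) Trim: keep only the largest-magnitude `keep_frac` entries per model.
    trimmed = []
    for tv in task_vecs:
        k = max(1, int(keep_frac * tv.numel()))
        thresh = tv.abs().flatten().kthvalue(tv.numel() - k + 1).values
        trimmed.append(torch.where(tv.abs() >= thresh, tv, torch.zeros_like(tv)))

    # (2) Elect sign: per entry, the sign with the larger total magnitude wins.
    stacked = torch.stack(trimmed)
    elected_sign = torch.sign(stacked.sum(dim=0))

    # (3) Disjoint merge: average only entries that agree with the elected sign.
    agree = (torch.sign(stacked) == elected_sign) & (stacked != 0)
    counts = agree.sum(dim=0).clamp(min=1)
    merged_tv = (stacked * agree).sum(dim=0) / counts

    return base + lam * merged_tv

# Hypothetical usage on a single flat parameter tensor per model.
base = torch.randn(1000)
models = [base + 0.1 * torch.randn(1000) for _ in range(3)]
merged = ties_merge(base, models)
```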