AWS Machine Learning Blog

NVIDIA NIM microservices now integrate with Amazon SageMaker, allowing you to deploy industry-leading large language models (LLMs) and optimize model performance and cost. You can deploy state-of-the-art LLMs in minutes instead of days using technologies such as NVIDIA TensorRT, NVIDIA TensorRT-LLM, and NVIDIA Triton Inference Server on NVIDIA accelerated instances hosted by SageMaker.
NIM, part of the NVIDIA AI Enterprise software platform listed on AWS marketplace, is a set of inference microservices that bring the power of state-of-the-art LLMs to your applications, providing natural language processing (NLP) and understanding capabilities, whether you’re developing chatbots, summarizing documents, or implementing other NLP-powered applications. You can use pre-built NVIDIA containers to host popular LLMs that are optimized for specific NVIDIA GPUs for quick deployment or use NIM tools to create your own containers.
In this post, we provide a high-level introduction to NIM and show how you can use it with SageMaker.
An introduction to NVIDIA NIM
NIM provides optimized and pre-generated engines for a variety of popular models for inference. These microservices support a variety of LLMs, such as Llama 2 (7B, 13B, and 70B), Mistral-7B-Instruct, Mixtral-8x7B, NVIDIA Nemotron-3 22B Persona, and Code Llama 70B, out of the box using pre-built NVIDIA TensorRT engines tailored for specific NVIDIA GPUs for maximum performance and utilization. These models are curated with the optimal hyperparameters for model-hosting performance, making it easy to deploy your applications.
If your model is not in NVIDIA’s set of curated models, NIM offers essential utilities such as the Model Repo Generator, which facilitates the creation of a TensorRT-LLM-accelerated engine and a NIM-format model directory through a straightforward YAML file. Furthermore, an integrated community backend of vLLM provides support for cutting-edge models and emerging features that may not have been seamlessly integrated into the TensorRT-LLM-optimized stack.
In addition to creating optimized LLMs for inference, NIM provides advanced hosting technologies such as optimized scheduling techniques like in-flight batching, which can break down the overall text generation process for an LLM into multiple iterations on the model. With in-flight batching, rather than waiting for the whole batch to finish before moving on to the next set of requests, the NIM runtime immediately evicts finished sequences from the batch. The runtime then begins running new requests while other requests are still in flight, making the best use of your compute instances and GPUs.
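The scheduling idea above can be sketched in a few lines. The following toy simulation is illustrative only — the request names and step counts are invented, and real NIM/TensorRT-LLM scheduling is far more involved — but it shows how evicting finished sequences lets new requests start before the rest of the batch drains:

```python
def inflight_batching(requests, max_batch_size, steps_needed):
    """Toy simulation of in-flight (continuous) batching.

    requests: list of request ids waiting to run.
    steps_needed: dict mapping request id -> number of decode steps it takes.
    Returns the order in which requests finish.
    """
    waiting = list(requests)
    active = {}          # request id -> remaining decode steps
    finished = []

    while waiting or active:
        # Admit new requests the moment slots free up, instead of
        # waiting for the whole batch to drain.
        while waiting and len(active) < max_batch_size:
            rid = waiting.pop(0)
            active[rid] = steps_needed[rid]

        # One decode iteration for every sequence currently in flight.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:      # sequence done: evict immediately
                del active[rid]
                finished.append(rid)

    return finished
```

With a batch size of 2, a short request finishes and frees its slot while longer ones continue, so a queued request starts immediately rather than waiting for the whole batch.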
Deploying NIM on SageMaker
NIM integrates with SageMaker, allowing you to host your LLMs with performance and cost optimization while benefiting from the capabilities of SageMaker. When you use NIM on SageMaker, you can use capabilities such as scaling out the number of instances to host your model, performing blue/green deployments, and evaluating workloads using shadow testing—all with best-in-class observability and monitoring with Amazon CloudWatch.
Conclusion
Using NIM to deploy optimized LLMs can be a great option for both performance and cost, and it makes deploying LLMs effortless. In the future, NIM will also support Parameter-Efficient Fine-Tuning (PEFT) customization methods such as LoRA and P-tuning, and plans to broaden LLM support with Triton Inference Server, TensorRT-LLM, and vLLM backends.
We encourage you to learn more about NVIDIA microservices and how to deploy your LLMs using SageMaker and try out the benefits available to you. NIM is available as a paid offering as part of the NVIDIA AI Enterprise software subscription available on AWS Marketplace.
In the near future, we will post an in-depth guide for NIM on SageMaker.

About the authors
James Park is a Solutions Architect at Amazon Web Services. He works with Amazon.com to design, build, and deploy technology solutions on AWS, and has a particular interest in AI and machine learning. In his spare time he enjoys seeking out new cultures, new experiences, and staying up to date with the latest technology trends. You can find him on LinkedIn.
Saurabh Trikande is a Senior Product Manager for Amazon SageMaker Inference. He is passionate about working with customers and is motivated by the goal of democratizing machine learning. He focuses on core challenges related to deploying complex ML applications, multi-tenant ML models, cost optimizations, and making deployment of deep learning models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.
Qing Lan is a Software Development Engineer at AWS. He has worked on several challenging products at Amazon, including high-performance ML inference solutions and high-performance logging systems. Qing’s team successfully launched the first billion-parameter model in Amazon Advertising with very low latency requirements. Qing has in-depth knowledge of infrastructure optimization and deep learning acceleration.
Nikhil Kulkarni is a software developer with AWS Machine Learning, focusing on making machine learning workloads more performant on the cloud, and is a co-creator of AWS Deep Learning Containers for training and inference. He’s passionate about distributed Deep Learning Systems. Outside of work, he enjoys reading books, fiddling with the guitar, and making pizza.
Harish Tummalacherla is a Software Engineer with the Deep Learning Performance team at SageMaker. He works on performance engineering for serving large language models efficiently on SageMaker. In his spare time, he enjoys running, cycling, and ski mountaineering.
Eliuth Triana Isaza is a Developer Relations Manager at NVIDIA, empowering Amazon’s AI MLOps, DevOps, scientists, and AWS technical experts to master the NVIDIA computing stack for accelerating and optimizing generative AI foundation models, spanning data curation, GPU training, model inference, and production deployment on AWS GPU instances. In addition, Eliuth is a passionate mountain biker, skier, and tennis and poker player.
Jiahong Liu is a Solution Architect on the Cloud Service Provider team at NVIDIA. He assists clients in adopting machine learning and AI solutions that leverage NVIDIA accelerated computing to address their training and inference challenges. In his leisure time, he enjoys origami, DIY projects, and playing basketball.
Kshitiz Gupta is a Solutions Architect at NVIDIA. He enjoys educating cloud customers about the GPU AI technologies NVIDIA has to offer and assisting them with accelerating their machine learning and deep learning applications. Outside of work, he enjoys running, hiking and wildlife watching.
Go to Source
19/03/2024 – 00:02 /James Park
Twitter: @hoffeldtcom

MIT News – Artificial intelligence

Imagine yourself glancing at a busy street for a few moments, then trying to sketch the scene you saw from memory. Most people could draw the rough positions of the major objects like cars, people, and crosswalks, but almost no one can draw every detail with pixel-perfect accuracy. The same is true for most modern computer vision algorithms: They are fantastic at capturing high-level details of a scene, but they lose fine-grained details as they process information.

Now, MIT researchers have created a system called “FeatUp” that lets algorithms capture all of the high- and low-level details of a scene at the same time — almost like Lasik eye surgery for computer vision.

When computers learn to “see” from looking at images and videos, they build up “ideas” of what’s in a scene through something called “features.” To create these features, deep networks and visual foundation models break down images into a grid of tiny squares and process these squares as a group to determine what’s going on in a photo. Each tiny square is usually made up of anywhere from 16 to 32 pixels, so the resolution of these algorithms is dramatically smaller than the images they work with. In trying to summarize and understand photos, algorithms lose a ton of pixel clarity. 
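The resolution gap is easy to quantify. As an illustrative calculation (the 224×224 input size is a common vision-model default assumed here, not a number from the article):

```python
def feature_grid_size(image_size: int, patch_size: int) -> int:
    """Cells per side of the feature grid when an image is split into
    square patches of patch_size pixels."""
    return image_size // patch_size

# A 224x224 image with 16-pixel patches becomes a 14x14 grid of features:
# the model reasons over 196 coarse cells instead of 50,176 pixels.
cells_16 = feature_grid_size(224, 16)   # 14 cells per side
cells_32 = feature_grid_size(224, 32)   # 7 cells per side with 32-pixel patches
```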

The FeatUp algorithm can stop this loss of information and boost the resolution of any deep network without compromising on speed or quality. This allows researchers to quickly and easily improve the resolution of any new or existing algorithm. For example, imagine trying to interpret the predictions of a lung cancer detection algorithm with the goal of localizing the tumor. Applying FeatUp before interpreting the algorithm using a method like class activation maps (CAM) can yield a dramatically more detailed (16-32x) view of where the tumor might be located according to the model. 

FeatUp not only helps practitioners understand their models, but also can improve a panoply of different tasks like object detection, semantic segmentation (assigning labels to pixels in an image with object labels), and depth estimation. It achieves this by providing more accurate, high-resolution features, which are crucial for building vision applications ranging from autonomous driving to medical imaging.

“The essence of all computer vision lies in these deep, intelligent features that emerge from the depths of deep learning architectures. The big challenge of modern algorithms is that they reduce large images to  very small grids of ‘smart’ features, gaining intelligent insights but losing the finer details,” says Mark Hamilton, an MIT PhD student in electrical engineering and computer science, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) affiliate, and a co-lead author on a paper about the project. “FeatUp helps enable the best of both worlds: highly intelligent representations with the original image’s resolution. These high-resolution features significantly boost performance across a spectrum of computer vision tasks, from enhancing object detection and improving depth prediction to providing a deeper understanding of your network’s decision-making process through high-resolution analysis.” 

Resolution renaissance 

As these large AI models become more and more prevalent, there’s an increasing need to explain what they’re doing, what they’re looking at, and what they’re thinking. 

But how exactly can FeatUp discover these fine-grained details? Curiously, the secret lies in wiggling and jiggling images. 

In particular, FeatUp applies minor adjustments (like moving the image a few pixels to the left or right) and watches how an algorithm responds to these slight movements of the image. This results in hundreds of deep-feature maps that are all slightly different, which can be combined into a single crisp, high-resolution set of deep features. “We imagine that some high-resolution features exist, and that when we wiggle them and blur them, they will match all of the original, lower-resolution features from the wiggled images. Our goal is to learn how to refine the low-resolution features into high-resolution features using this ‘game’ that lets us know how well we are doing,” says Hamilton. This methodology is analogous to how algorithms can create a 3D model from multiple 2D images by ensuring that the predicted 3D object matches all of the 2D photos used to create it. In FeatUp’s case, they predict a high-resolution feature map that’s consistent with all of the low-resolution feature maps formed by jittering the original image.
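The "wiggle and compare" objective can be sketched with plain NumPy. This is a toy stand-in, not FeatUp's actual implementation: it uses average pooling as the downsampler and horizontal pixel shifts as the jitters, whereas the real method learns its downsampling model and uses richer transforms:

```python
import numpy as np

def avg_pool(x, k):
    """Downsample a 2D array by averaging non-overlapping k x k blocks."""
    h, w = x.shape
    return x[: h - h % k, : w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def jitter_consistency_loss(hi_res_guess, inputs, shifts, k):
    """Score a candidate high-resolution feature map: for each horizontal
    jitter, the shifted-then-pooled guess must match the low-resolution
    features computed from the identically shifted input."""
    loss = 0.0
    for dx in shifts:
        pooled_guess = avg_pool(np.roll(hi_res_guess, dx, axis=1), k)
        lo_res_target = avg_pool(np.roll(inputs, dx, axis=1), k)
        loss += ((pooled_guess - lo_res_target) ** 2).mean()
    return loss / len(shifts)
```

A high-resolution guess that is consistent with every jittered, downsampled view scores zero; any inconsistency shows up as a positive loss the upsampler can be trained to minimize.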

The team notes that standard tools available in PyTorch were insufficient for their needs, and introduced a new type of deep network layer in their quest for a speedy and efficient solution. Their custom layer, a special joint bilateral upsampling operation, was over 100 times more efficient than a naive implementation in PyTorch. The team also showed this new layer could improve a wide variety of different algorithms including semantic segmentation and depth prediction. This layer improved the network’s ability to process and understand high-resolution details, giving any algorithm that used it a substantial performance boost. 
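For intuition, a naive joint bilateral upsampling can be written directly. This illustrative O(HW·r²) loop is nothing like the team's 100×-faster custom layer, but it shows the core weighting: each high-resolution output pixel averages nearby low-resolution feature values, weighted by both spatial distance and similarity in a high-resolution guidance image, so feature edges snap to image edges:

```python
import numpy as np

def joint_bilateral_upsample(lo, guide, scale, sigma_spatial=1.0, sigma_range=0.1, radius=1):
    """Naive joint bilateral upsampling of a low-res 2D feature map `lo`
    to the resolution of the high-res guidance image `guide`."""
    H, W = guide.shape
    h, w = lo.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            cy, cx = y / scale, x / scale          # position in the low-res grid
            acc = norm = 0.0
            for j in range(int(cy) - radius, int(cy) + radius + 1):
                for i in range(int(cx) - radius, int(cx) + radius + 1):
                    if 0 <= j < h and 0 <= i < w:
                        d2 = (j - cy) ** 2 + (i - cx) ** 2
                        # Guidance value at the low-res sample's hi-res location.
                        g = guide[min(int(j * scale), H - 1), min(int(i * scale), W - 1)]
                        r2 = (guide[y, x] - g) ** 2
                        wgt = np.exp(-d2 / (2 * sigma_spatial ** 2) - r2 / (2 * sigma_range ** 2))
                        acc += wgt * lo[j, i]
                        norm += wgt
            out[y, x] = acc / norm
    return out
```

The range term is what distinguishes this from plain bilinear upsampling: low-res samples whose guidance pixels look different from the output pixel are down-weighted, which is why upsampled features stay sharp at object boundaries.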

“Another application is something called small object retrieval, where our algorithm allows for precise localization of objects. For example, even in cluttered road scenes algorithms enriched with FeatUp can see tiny objects like traffic cones, reflectors, lights, and potholes where their low-resolution cousins fail. This demonstrates its capability to enhance coarse features into finely detailed signals,” says Stephanie Fu ’22, MNG ’23, a PhD student at the University of California at Berkeley and another co-lead author on the new FeatUp paper. “This is especially critical for time-sensitive tasks, like pinpointing a traffic sign on a cluttered expressway in a driverless car. This can not only improve the accuracy of such tasks by turning broad guesses into exact localizations, but might also make these systems more reliable, interpretable, and trustworthy.”

What next?

Regarding future aspirations, the team emphasizes FeatUp’s potential widespread adoption within the research community and beyond, akin to data augmentation practices. “The goal is to make this method a fundamental tool in deep learning, enriching models to perceive the world in greater detail without the computational inefficiency of traditional high-resolution processing,” says Fu.

“FeatUp represents a wonderful advance towards making visual representations really useful, by producing them at full image resolutions,” says Cornell University computer science professor Noah Snavely, who was not involved in the research. “Learned visual representations have become really good in the last few years, but they are almost always produced at very low resolution — you might put in a nice full-resolution photo, and get back a tiny, postage stamp-sized grid of features. That’s a problem if you want to use those features in applications that produce full-resolution outputs. FeatUp solves this problem in a creative way by combining classic ideas in super-resolution with modern learning approaches, leading to beautiful, high-resolution feature maps.”

“We hope this simple idea can have broad application. It provides high-resolution versions of image analytics that we’d thought before could only be low-resolution,” says senior author William T. Freeman, an MIT professor of electrical engineering and computer science and CSAIL member.

Lead authors Fu and Hamilton are accompanied by MIT PhD students Laura Brandt SM ’21 and Axel Feldmann SM ’21, as well as Zhoutong Zhang SM ’21, PhD ’22, all current or former affiliates of MIT CSAIL. Their research is supported, in part, by a National Science Foundation Graduate Research Fellowship, by the National Science Foundation and Office of the Director of National Intelligence, by the U.S. Air Force Research Laboratory, and by the U.S. Air Force Artificial Intelligence Accelerator. The group will present their work in May at the International Conference on Learning Representations.
18/03/2024 – 21:04 /Rachel Gordon | MIT CSAIL

Coronavirus | The Guardian

Plaintiffs in Murthy v Missouri argue that White House requests to take down coronavirus misinformation violate the first amendment. The supreme court heard oral arguments on Monday in a case that could upend the federal government’s relationship with social media companies and with lies online. Plaintiffs in Murthy v Missouri argue that White House requests to take down coronavirus misinformation on Twitter and Facebook constitute illegal censorship in violation of the first amendment. The arguments began with Brian Fletcher, the principal deputy solicitor general of the justice department, arguing that none of the government’s communications crossed the line from persuasion into coercion. He also pushed back against descriptions of events in lower court rulings, stating that they were misleading or included quotations taken out of context. Continue reading…
18/03/2024 – 21:04 /Nick Robins-Early

Coronavirus | The Guardian

Senior detective warns children are accessing extreme material as a result of lockdowns, after a 20-year-old was jailed on Monday. A senior counter-terrorism officer has warned that children and young people are increasingly being radicalised online after spending long periods on the internet during the pandemic. Det Supt Andy Meeks said a growing number of vulnerable people were accessing extreme material after spending hours unsupervised online. Continue reading…
18/03/2024 – 21:04 /Josh Halliday North of England editor

MIT News – Artificial intelligence

Cancer Grand Challenges recently announced five winning teams for 2024, which included five researchers from MIT: Michael Birnbaum, Regina Barzilay, Brandon DeKosky, Seychelle Vos, and Ömer Yilmaz. Each team is made up of interdisciplinary cancer researchers from across the globe and will be awarded $25 million over five years. 

Birnbaum, an associate professor in the Department of Biological Engineering, leads Team MATCHMAKERS and is joined by co-investigators Barzilay, the School of Engineering Distinguished Professor for AI and Health in the Department of Electrical Engineering and Computer Science and the AI faculty lead at the MIT Abdul Latif Jameel Clinic for Machine Learning in Health; and DeKosky, Phillip and Susan Ragon Career Development Professor of Chemical Engineering. All three are also affiliates of the Koch Institute for Integrative Cancer Research at MIT.

Team MATCHMAKERS will take advantage of recent advances in artificial intelligence to develop tools for personalized immunotherapies for cancer patients. Cancer immunotherapies, which recruit the patient’s own immune system against the disease, have transformed treatment for some cancers, but not for all types and not for all patients. 

T cells are one target for immunotherapies because of their central role in the immune response. These immune cells use receptors on their surface to recognize protein fragments called antigens on cancer cells. Once T cells attach to cancer antigens, they mark them for destruction by the immune system. However, T cell receptors are exceptionally diverse within one person’s immune system and from person to person, making it difficult to predict how any one cancer patient will respond to an immunotherapy.  

Team MATCHMAKERS will collect data on T cell receptors and the different antigens they target and build computer models to predict antigen recognition by different T cell receptors. The team’s overarching goal is to develop tools for predicting T cell recognition with simple clinical lab tests and designing antigen-specific immunotherapies. “If successful, what we learn on our team could help transform prediction of T cell receptor recognition from something that is only possible in a few sophisticated laboratories in the world, for a few people at a time, into a routine process,” says Birnbaum. 

“The MATCHMAKERS project draws on MIT’s long tradition of developing cutting-edge artificial intelligence tools for the benefit of society,” comments Ryan Schoenfeld, CEO of The Mark Foundation for Cancer Research. “Their approach to optimizing immunotherapy for cancer and many other diseases is exemplary of the type of interdisciplinary research The Mark Foundation prioritizes supporting.” In addition to The Mark Foundation, the MATCHMAKERS team is funded by Cancer Research UK and the U.S. National Cancer Institute.

Vos, the Robert A. Swanson (1969) Career Development Professor of Life Sciences and HHMI Freeman Hrabowski Scholar in the Department of Biology, will be a co-investigator on Team KOODAC. The KOODAC team will develop new treatments for solid tumors in children, using protein degradation strategies to target previously “undruggable” drivers of cancers. KOODAC is funded by Cancer Research UK, France’s Institut National Du Cancer, and KiKa (Children Cancer Free Foundation) through Cancer Grand Challenges. 

As a co-investigator on team PROSPECT, Yilmaz, who is also a Koch Institute affiliate, will help address early-onset colorectal cancers, an emerging global problem among individuals younger than 50 years. The team seeks to elucidate pathways, risk factors, and molecules involved in the disease’s development. Team PROSPECT is supported by Cancer Research UK, the U.S. National Cancer Institute, the Bowelbabe Fund for Cancer Research UK, and France’s Institut National Du Cancer through Cancer Grand Challenges.  
18/03/2024 – 18:02 /Bendta Schroeder | Koch Institute

AWS Machine Learning Blog

Today, we are excited to announce the capability to fine-tune Code Llama models by Meta using Amazon SageMaker JumpStart. The Code Llama family of large language models (LLMs) is a collection of pre-trained and fine-tuned code generation models ranging in scale from 7 billion to 70 billion parameters. Fine-tuned Code Llama models provide better accuracy and explainability over the base Code Llama models, as evidenced by testing against the HumanEval and MBPP datasets. You can fine-tune and deploy Code Llama models with SageMaker JumpStart using the Amazon SageMaker Studio UI with a few clicks or using the SageMaker Python SDK. Fine-tuning of Llama models is based on the scripts provided in the llama-recipes GitHub repo from Meta, using PyTorch FSDP, PEFT/LoRA, and Int8 quantization techniques.
In this post, we walk through how to fine-tune Code Llama pre-trained models via SageMaker JumpStart through a one-click UI and SDK experience available in the following GitHub repository.
What is SageMaker JumpStart
With SageMaker JumpStart, machine learning (ML) practitioners can choose from a broad selection of publicly available foundation models. ML practitioners can deploy foundation models to dedicated Amazon SageMaker instances from a network isolated environment and customize models using SageMaker for model training and deployment.
What is Code Llama
Code Llama is a code-specialized version of Llama 2 that was created by further training Llama 2 on its code-specific datasets and sampling more data from that same dataset for longer. Code Llama features enhanced coding capabilities. It can generate code and natural language about code, from both code and natural language prompts (for example, “Write me a function that outputs the Fibonacci sequence”). You can also use it for code completion and debugging. It supports many of the most popular programming languages used today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, Bash, and more.
Why fine-tune Code Llama models
Meta published Code Llama performance benchmarks on HumanEval and MBPP for common coding languages such as Python, Java, and JavaScript. The performance of Code Llama Python models on HumanEval demonstrated varying performance across coding languages and tasks, ranging from 38% for the 7B Python model to 57% for the 70B Python model. In addition, Code Llama models fine-tuned on the SQL programming language have shown better results, as evident in SQL evaluation benchmarks. These published benchmarks highlight the potential benefits of fine-tuning Code Llama models, enabling better performance, customization, and adaptation to specific coding domains and tasks.
No-code fine-tuning via the SageMaker Studio UI
To start fine-tuning your Llama models using SageMaker Studio, complete the following steps:

On the SageMaker Studio console, choose JumpStart in the navigation pane.

You will find listings of over 350 models, spanning both open source and proprietary models.

Search for Code Llama models.

If you don’t see Code Llama models, you can update your SageMaker Studio version by shutting down and restarting. For more information about version updates, refer to Shut down and Update Studio Apps. You can also find other model variants by choosing Explore all Code Generation Models or searching for Code Llama in the search box.

SageMaker JumpStart currently supports instruction fine-tuning for Code Llama models. The following screenshot shows the fine-tuning page for the Code Llama 2 70B model.

For Training dataset location, you can point to the Amazon Simple Storage Service (Amazon S3) bucket containing the training and validation datasets for fine-tuning.
Set your deployment configuration, hyperparameters, and security settings for fine-tuning.
Choose Train to start the fine-tuning job on a SageMaker ML instance.

We discuss the dataset format you need to prepare for instruction fine-tuning in the next section.

After the model is fine-tuned, you can deploy it using the model page on SageMaker JumpStart.

The option to deploy the fine-tuned model will appear when fine-tuning is finished, as shown in the following screenshot.

Fine-tune via the SageMaker Python SDK
In this section, we demonstrate how to fine-tune Code Llama models using the SageMaker Python SDK on an instruction-formatted dataset. Specifically, the model is fine-tuned for a set of natural language processing (NLP) tasks described using instructions. This helps improve the model’s performance for unseen tasks with zero-shot prompts.
Complete the following steps to complete your fine-tuning job. You can get the entire fine-tuning code from the GitHub repository.
First, let’s look at the dataset format required for the instruction fine-tuning. The training data should be formatted in a JSON lines (.jsonl) format, where each line is a dictionary representing a data sample. All training data must be in a single folder. However, it can be saved in multiple .jsonl files. The following is a sample in JSON lines format:

{
    "system_prompt": "a chat",
    "question": "Please focus on the efficiency of this problem and provide code in python:\nYou are given two strings `s` and `t` consisting of only lowercase English letters.\n\nReturn _the minimum number of characters that need to be appended to the end of_ `s` _so that_ `t` _becomes a **subsequence** of_ `s`.\n\nA **subsequence** is a string that can be derived from another string by deleting some or no characters without changing the order of the remaining characters.\n\n**Example 1:**\n\n**Input:** s = \"coaching\", t = \"coding\"\n**Output:** 4\n**Explanation:** Append the characters \"ding\" to the end of s so that s = \"coachingding\".\nNow, t is a subsequence of s (\"**co**aching**ding**\").\nIt can be shown that appending any 3 characters to the end of s will never make t a subsequence.\n\n**Example 2:**\n\n**Input:** s = \"abcde\", t = \"a\"\n**Output:** 0\n**Explanation:** t is already a subsequence of s (\"**a**bcde\").\n\n**Example 3:**\n\n**Input:** s = \"z\", t = \"abcde\"\n**Output:** 5\n**Explanation:** Append the characters \"abcde\" to the end of s so that s = \"zabcde\".\nNow, t is a subsequence of s (\"z**abcde**\").\nIt can be shown that appending any 4 characters to the end of s will never make t a subsequence.\n\n**Constraints:**\n\n* `1 …`",
    …
}

…

    while l > 0 and r < n - 1 and s[l - 1] == s[r + 1]:
        l -= 1
        r += 1
    length = r - l + 1
    if length > max_length:
        start, max_length = l, length
    return s[start:start + max_length]

Interestingly, our fine-tuned version of Code Llama 34B Python provides a dynamic programming-based solution to the longest palindromic substring, which is different from the solution provided in the ground truth from the selected test example. Our fine-tuned model reasons and explains the dynamic programming-based solution in detail. On the other hand, the non-fine-tuned model hallucinates potential outputs right after the print statement (shown in the left cell) because the output axyzzyx is not the longest palindrome in the given string. In terms of time complexity, both solutions are quadratic, O(n^2) in the length of the input string, but the fine-tuned model’s dynamic programming solution is the more optimized of the two.
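For reference, a complete, self-contained solution to the longest palindromic substring problem discussed here, using the expand-around-center technique with O(n^2) time and O(1) extra space, looks like this (our illustrative sketch, not either model's verbatim output):

```python
def longest_palindromic_substring(s):
    """Expand around each center (a letter or a gap between letters) and
    keep the longest palindrome seen. O(n^2) time, O(1) extra space."""
    if not s:
        return ""
    n = len(s)
    start, max_length = 0, 1
    for center in range(2 * n - 1):
        l, r = center // 2, (center + 1) // 2   # odd- and even-length centers
        if l != r and s[l] != s[r]:
            continue
        while l > 0 and r < n - 1 and s[l - 1] == s[r + 1]:
            l -= 1
            r += 1
        length = r - l + 1
        if length > max_length:
            start, max_length = l, length
    return s[start:start + max_length]
```

For the string axyzzyx mentioned above, the longest palindrome is xyzzyx, not the full string, which is exactly the kind of detail the non-fine-tuned model got wrong.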
This looks promising! Remember, we only fine-tuned the Code Llama Python variant on 10% of the Dolphin Coder dataset. There is a lot more to explore!
Despite the thorough instructions in the response, we still need to examine the correctness of the Python code provided in the solution. Next, we use an evaluation framework called HumanEval to run integration tests on the generated responses from Code Llama to systematically examine their quality.
Quantitative evaluation with HumanEval
HumanEval is an evaluation harness for evaluating an LLM’s problem-solving capabilities on Python-based coding problems, as described in the paper Evaluating Large Language Models Trained on Code. Specifically, it consists of 164 original Python-based programming problems that assess a language model’s ability to generate code based on provided information like function signature, docstring, body, and unit tests.
For each Python-based programming question, we send it to a Code Llama model deployed on a SageMaker endpoint to get k responses. Next, we run each of the k responses on the integration tests in the HumanEval repository. If any of the k responses passes the integration tests, we count that test case as passed; otherwise, it fails. Then we repeat the process to calculate the ratio of successful cases as the final evaluation score, named pass@k. Following standard practice, we set k to 1 in our evaluation, generating only one response per question and testing whether it passes the integration test.
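The pass@k computation is small enough to spell out. The general unbiased estimator from the paper Evaluating Large Language Models Trained on Code is shown below; with k = 1 and one sample per problem, it reduces to the simple pass/fail ratio used in this post:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: the probability that at least one
    of k samples, drawn without replacement from n generated samples of
    which c are correct, passes the tests."""
    if n - c < k:
        return 1.0   # every size-k draw must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def aggregate_pass_at_k(per_problem_counts, k):
    """Average pass@k over a benchmark; per_problem_counts is a list of
    (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in per_problem_counts) / len(per_problem_counts)
```

With n = k = 1, each problem contributes 1.0 if its single response passed and 0.0 otherwise, so the aggregate is just the fraction of the 164 HumanEval problems solved.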
The following is sample code that uses the HumanEval repository to access the dataset and generate a single response per problem using a SageMaker endpoint. For details, see the notebook in the GitHub repository.

%pip install human_eval
import json
from human_eval.data import write_jsonl, read_problems
from human_eval.evaluation import evaluate_functional_correctness
from tqdm import tqdm

problems = read_problems()

num_samples_per_task = 1  # value k: number of responses for each question
samples = [
    dict(task_id=task_id, completion=generate_one_completion(problems[task_id]["prompt"]))
    for task_id in tqdm(problems)
    for _ in range(num_samples_per_task)
]
write_jsonl("samples.jsonl", samples)

evaluate_functional_correctness("./samples.jsonl")

The following table shows the improvements of the fine-tuned Code Llama Python models over the non-fine-tuned models across different model sizes. To ensure correctness, we also deploy the non-fine-tuned Code Llama models in SageMaker endpoints and run them through the HumanEval evaluation. The pass@1 numbers (the first row in the following table) match the reported numbers in the Code Llama research paper. The inference parameters are consistently set as "parameters": {"max_new_tokens": 384, "temperature": 0.2}.
As we can see from the results, all the fine-tuned Code Llama Python variants show significant improvement over the non-fine-tuned models. In particular, Code Llama Python 70B outperforms the non-fine-tuned model by approximately 12%.

Model                                      7B Python   13B Python   34B    34B Python   70B Python
Pre-trained model performance (pass@1)     38.4        43.3         48.8   53.7         57.3
Fine-tuned model performance (pass@1)      45.12       45.12        59.1   61.5         69.5

Now you can try fine-tuning Code Llama models on your own dataset.
Clean up
If you decide that you no longer want to keep the SageMaker endpoint running, you can delete it using AWS SDK for Python (Boto3), AWS Command Line Interface (AWS CLI), or SageMaker console. For more information, see Delete Endpoints and Resources. Additionally, you can shut down the SageMaker Studio resources that are no longer required.
Conclusion
In this post, we discussed fine-tuning Meta’s Code Llama 2 models using SageMaker JumpStart. We showed that you can use the SageMaker JumpStart console in SageMaker Studio or the SageMaker Python SDK to fine-tune and deploy these models. We also discussed the fine-tuning technique, instance types, and supported hyperparameters. In addition, we outlined recommendations for optimized training based on various tests we carried out. As the HumanEval results show, fine-tuning significantly improves code generation compared to the non-fine-tuned models. As a next step, you can try fine-tuning these models on your own dataset using the code provided in the GitHub repository to test and benchmark the results for your use cases.

About the Authors
Dr. Xin Huang is a Senior Applied Scientist for Amazon SageMaker JumpStart and Amazon SageMaker built-in algorithms. He focuses on developing scalable machine learning algorithms. His research interests are in the area of natural language processing, explainable deep learning on tabular data, and robust analysis of non-parametric space-time clustering. He has published many papers in ACL, ICDM, KDD conferences, and Royal Statistical Society: Series A.
Vishaal Yalamanchali is a Startup Solutions Architect working with early-stage generative AI, robotics, and autonomous vehicle companies. Vishaal works with his customers to deliver cutting-edge ML solutions and is personally interested in reinforcement learning, LLM evaluation, and code generation. Prior to AWS, Vishaal was an undergraduate at UCI, focused on bioinformatics and intelligent systems.
Meenakshisundaram Thandavarayan works for AWS as an AI/ML Specialist. He has a passion for designing, creating, and promoting human-centered data and analytics experiences. Meena focuses on developing sustainable systems that deliver measurable, competitive advantages for strategic customers of AWS. Meena is a connector and design thinker, and strives to drive businesses to new ways of working through innovation, incubation, and democratization.
Dr. Ashish Khetan is a Senior Applied Scientist with Amazon SageMaker built-in algorithms and helps develop machine learning algorithms. He got his PhD from University of Illinois Urbana-Champaign. He is an active researcher in machine learning and statistical inference, and has published many papers in NeurIPS, ICML, ICLR, JMLR, ACL, and EMNLP conferences.
Go to Source
18/03/2024 – 18:02 /Xin Huang
Twitter: @hoffeldtcom

Everyone’s Blog Posts – RecruitingBlogs

Human resource management is among the top industries in the world, and effective human resource management ultimately leads to increased organizational productivity and efficiency. Artificial intelligence is transforming industries across the world by automating tasks, improving accuracy and speed, and making the workplace smarter.
With the help of AI and data science, HR is evolving from a primarily administrative department into a more strategic one that contributes to business success. By using powerful AI tools and technologies, HR can streamline various processes and gain deeper insights into talent management, helping businesses build and retain a skilled and talented workforce.
Did you know that data science certifications are a great way to enhance your data science skills and validate your knowledge and expertise in this domain?
Artificial Intelligence for HR Management
First, let us understand the role of Artificial Intelligence in streamlining various HR processes and enhancing efficiency.
To begin with, we must acknowledge the power of artificial intelligence to automate routine, repetitive tasks, freeing up HR professionals' precious time to focus on more productive work.
Here are some ways AI can be used to transform HR management:
1.  Automating screening and recruitment process
For any given job role, HR professionals may receive thousands of resumes. Finding candidates whose skill sets match the job description can be a tedious and time-consuming process, but AI-powered applicant tracking systems (ATS) can automate this task and match candidates against pre-defined criteria.
A recent study by Greenhouse found that 82% of HR professionals are leveraging AI in their recruitment process, and 62% said they have significantly reduced the time spent on resume screening.
Interesting fact: Data science jobs are ranked as the 5th fastest-growing jobs in the world by the World Economic Forum employment report.
2.  Making onboarding and training experience smoother
Onboarding is an important part of the employee experience and plays a key role in engagement and retention. With the help of an AI chatbot, new joiners can get personalized information and answers to frequently asked questions, helping make the onboarding experience pleasant and memorable.
On top of that, AI can personalize training programs based on analysis of employee data, ensuring employees gain the necessary skills on time without making training unnecessarily long or boring. IBM research found that personalized onboarding experiences can lead to a 50% increase in retention of new employees.
3.  Simplifying performance management
Traditional performance reviews were often subjective and time-consuming. AI-powered systems can now automate the collection and analysis of performance data, giving HR professionals valuable insights for clearer evaluations.
Moreover, AI technology can also be used to offer real-time feedback and performance improvement suggestions that can encourage a culture of continuous learning and development.
Data Science for HR Management
Just like artificial intelligence, data science is helping transform HR management. It enables HR professionals to switch from intuition-based decisions to data-driven decisions: by analyzing vast amounts of employee data, HR professionals can understand their workforce better.
With the help of data science certification and data science courses, you can easily pave the path to a rewarding data science career and contribute to transforming industries like HR.
Here are some ways in which data science can prove to be beneficial in HR processes:
1.  Building a data-driven HR culture
This means building and encouraging a culture in which HR professionals welcome the use of data science: collecting and analyzing data and applying it to their decision-making. They must be trained in data science skills so they can translate data into insights.
2.  Identify skill gaps and predict talent requirements
With the help of data science, HR professionals can forecast future workforce requirements based on historical trends and business projections. They can then identify skill gaps and plan to address them through targeted recruitment, upskilling programs, or partnerships with education and certification institutions such as USDSI®.
In a recent study, McKinsey & Company estimated that up to 800 million jobs could be displaced by automation by 2030, underscoring how important skill development has become.
3.  Boosting employee engagement and retention
Employees now want more than just work and a salary, so organizations are investing more time and money in employee engagement strategies, which ultimately boost productivity. Data science helps HR professionals understand employee sentiment by analyzing large volumes of employee data gathered from surveys, performance reviews, and internal communication channels.
Data science helps identify issues that affect employee engagement so that HR professionals can address them and improve employee satisfaction. According to a Gallup study, engaged employees are 21% more profitable than less engaged employees.
Conclusion
AI and data science have been revolutionizing all industries, and HR is no exception. With the right use of technology, HR processes can be streamlined to boost productivity and efficiency. However, organizations must consider the ethical consequences and address the challenges of using AI and data science in HR, including algorithmic bias and data privacy. As we move ahead, AI and data science will play an ever more important role in transforming HR operations. So, let's see what the future holds.
18/03/2024 – 12:02 /Lucia Adams

Coronavirus | The Guardian

Figures show many still 'out of the habit' of visiting museums, galleries, cathedrals, castles and country houses.

Visitor numbers to the UK's museums, galleries, cathedrals, zoos, castles and country houses are increasing but remain stubbornly below pre-pandemic levels, with a significant number of people still "out of the habit" of having a day out.

Figures released by the Association of Leading Visitor Attractions (ALVA) on Monday show a mixed picture. On the bright side, there was a 19% increase in visitor numbers in 2023 compared with 2022. The British Museum saw a 42% rise, making it the most visited attraction in the UK.
18/03/2024 – 09:02 /Mark Brown
