AI

This website lets you merge photos with video and audio to make them talk

After Deep Nostalgia, here’s another tool that lets you animate your still images. Tokkingheads lets you choose a photo or avatar, merge it with a video to copy the moves from it, and even add audio to your creations. It’s a bit less impressive than Deep Nostalgia, but more fun and interactive… and I got […]

Take a Trip Back in Time Via Colorized Footage From a Day in 1920s Paris

Travel back in time to Paris during the roaring ’20s featuring flappers, bobbed hair, and cloche hats in this short 2-minute archival video colorized by Glamourdaze. The video was created using artificial intelligence to restore the footage and add color.

This short clip is an edit from a series of travelogues by early filmmaker Burton Holmes, who himself coined the term “travelogue.” According to Glamourdaze, Holmes gave lectures and included slides and motion pictures of his trips, with this being one such example. The original footage can be viewed in its original state below.

To make a direct comparison between the finished, colorized, 4K footage uploaded by Glamourdaze and the original source material, the footage starting at 0:28 of Glamourdaze’s video can be found at 12:12 of the original archival footage.

To create this particular video, Glamourdaze writes that DeNoise was first applied and artifacts removed, then the frame rate was raised to 60 fps through motion interpolation using Dainapp, an open-source deep learning program. Next, the footage was upscaled to 4K using AI, and finally, color was added using DeOldify.
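For readers curious what such a restoration chain looks like in practice, here is a minimal Python sketch built on ffmpeg’s stock filters, which are only rough stand-ins for the tools Glamourdaze actually used: hqdn3d in place of DeNoise, minterpolate in place of Dainapp, and a plain Lanczos resample in place of AI upscaling. The DeOldify colorization step is omitted because ffmpeg has no comparable filter, and the file names are placeholders.

```python
# Rough sketch of a denoise -> 60 fps interpolation -> upscale chain using
# stock ffmpeg filters. These are stand-ins, not the tools named above, and
# the AI colorization step (DeOldify) is omitted entirely.
import subprocess

def restore(src: str, dst: str) -> None:
    filters = ",".join([
        "hqdn3d=4:3:6:4",                    # spatial/temporal denoise
        "minterpolate=fps=60:mi_mode=mci",   # motion-compensated interpolation to 60 fps
        "scale=3840:-2:flags=lanczos",       # resample toward 4K (not AI upscaling)
    ])
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-vf", filters, "-c:v", "libx264", "-crf", "18", dst],
        check=True,
    )

restore("paris_1920s_original.mp4", "paris_1920s_restored.mp4")  # placeholder file names
```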

The finished footage is much smoother than the original thanks to the 60 frames-per-second interpolation. While less “cinematic” than the original, the result is slightly more lifelike motion. The significant artifacting and low resolution of the original footage have been addressed, but the result is imperfect: many of the individual frames look “mushy” or “hazy,” and coloration can be inconsistent when viewed frame by frame. Still, the finished product is pretty good considering the quality of the source material.

These issues are part of the reason some historians are asking people like Glamourdaze to stop upscaling and colorizing historical footage.

“It is a nonsense,” Luke McKernan, the lead curator of news and moving images at the British Library, told Wired last year. “Colourisation does not bring us closer to the past; it increases the gap between now and then. It does not enable immediacy; it creates difference.”

“The problem with colourisation is it leads people to just think about photographs as a kind of uncomplicated window onto the past, and that’s not what photographs are,” another historian argued.

Still, the outcries of a few historians have not slowed the interest in colorized footage. This particular short video from Glamourdaze has already amassed more than a million views on YouTube since it was uploaded at the end of January, and many of the other videos on the channel have millions more.

If you would like to see more peeks into a colorized past, you can subscribe to Glamourdaze’s YouTube Channel.


(via Laughing Squid)

How Much of a Photo Can Be Deleted Before AI Can’t Recognize It?

In a new project that mixes science and art, artistic duo Shinseungback Kimyonghun has created a series of images that have pixels removed until an AI program can no longer recognize the subject — in this case mountains. Impressively, much of the image can be deleted before this happens.

“Shinseungback Kimyonghun” is a Seoul-based artistic duo consisting of engineer Shin Seung Back and artist Kim Yong Hun. The two have many projects that mix AI visual recognition with photography. In one from 2012 called Cloud Face, the two had an AI look at moving clouds in an attempt to pull frames that looked like human faces. In another project called Flower, the AI was shown a series of distorted flower images which could still be recognized by the AI.

In a similar vein to Flower, this latest project, titled Mou ta n, examines the limits of AI’s current object recognition capabilities.

“An unfinished painting by Paul Cézanne, Still Life with Water Jug inspired the project,” Yong Hun tells PetaPixel. “Although almost half of the painting is unpainted, we can still see the objects. Would it be visible to AI as well? If so, how much should it be erased to be unrecognizable to AI? What would it look like to humans then? These questions arose and we applied the idea to mountain images.”

Yong Hun says that he and Back used three different object recognition systems for this project: Google Cloud Vision, Detectron2 by Facebook, and Microsoft Azure Computer Vision. Once none of the three systems could see the mountains in an image, the duo considered the piece complete.
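To make that stopping rule concrete, here is a minimal Python sketch of one way such a loop could work. It is an assumption about the workflow, not the duo’s actual code: random patches are painted out until Google Cloud Vision (one of the three services they used) stops returning a mountain-related label. The file names and patch size are placeholders.

```python
# Illustrative sketch only: erase random patches until a label detector no
# longer reports a mountain. Not Shinseungback Kimyonghun's actual tooling.
import io
import random
from PIL import Image
from google.cloud import vision  # pip install google-cloud-vision

client = vision.ImageAnnotatorClient()

def sees_mountain(img: Image.Image) -> bool:
    """Return True if Google Cloud Vision still labels the image as a mountain."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG")
    response = client.label_detection(image=vision.Image(content=buf.getvalue()))
    return any("mountain" in label.description.lower() for label in response.label_annotations)

def erase_until_unrecognized(path: str, patch: int = 64, max_steps: int = 5000) -> Image.Image:
    img = Image.open(path).convert("RGB")
    for _ in range(max_steps):
        if not sees_mountain(img):
            break
        # Paint a random white square over the image, mimicking manual erasure.
        x = random.randrange(0, img.width - patch)
        y = random.randrange(0, img.height - patch)
        img.paste((255, 255, 255), (x, y, x + patch, y + patch))
    return img

erase_until_unrecognized("mountain.jpg").save("mountain_erased.jpg")  # placeholder paths
```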

Looking at the mostly-deleted images, it is rather impressive that the AI systems were able to see these as mountains for so long: some of these are nearly indistinguishable even to the human eye. Yong Hun and Back agree.

“We had to erase more than we expected,” Yong Hun says.

For the team, part of the challenge was finding the point where the AI could no longer recognize the mountain but a human still could.

“We wanted the erased mountains to still be visible to humans. This boundary was difficult to find. If we erase only a little, the AI sees the mountain. If we erase a lot, humans cannot see the mountain either,” Yong Hun said.

The portions they erased had to be chosen so that humans could still recognize the mountains. But Yong Hun says that, if they had wanted to, they could have deleted so much of the photo that a human wouldn’t be able to see what it was while an AI could still recognize it.

“The images were erased in the way that humans (at least myself) can see the mountains, but it could have been erased for AI can see but humans can’t,” he says.

The duo believes that as AI gets smarter, it will be harder and harder to create images that humans can still recognize but AI cannot. Part of what they found in the process was that, in many cases, AI may already be better at recognizing objects than humans are, and that will only become more true as the technology improves.

While certainly interesting as art, this project also points to the power of artificial intelligence. With image recognition and facial recognition becoming particularly powerful, society will soon have to decide the limits of how the technology can be deployed, what it can be used on, and how governments should be expected to use or not use it.

To see Shinseungback Kimyonghun’s full project, check it out here.


Image credits: Photos by Shinseungback Kimyonghun and used with permission.

Researchers Release Improved 3D Models Made From Tourists’ Photos

Artificial Intelligence and machine learning advancements have allowed researchers to build detailed 3D models of real-world locations by using the reference data of thousands of tourists’ photos. The finished models have cleanly removed unwanted objects and even normalized lighting conditions.

The project and associated paper are titled Neural Radiance Fields for Unconstrained Photo Collections and were originally published in August of 2020. The project was recently updated with even more examples of its application, a deep-dive video explanation of how the program works, and published findings that take the idea of converting 2D to 3D a step further.

To recap, researchers used a photo tourism data set of thousands of images to produce highly-detailed models of iconic locations.

“You can see that we are able to produce high-quality renderings of novel views of these scenes using only unstructured image collections as input,” the researchers say.

“Getting good results from uncontrolled internet photos can be a challenging task because these images have likely been taken at different times,” the researchers explain. “So the weather might change or the sun might move. They can have different types of post-processing applied to them. Also, people generally don’t take photos of landmarks in isolation: there might be people posing for the camera, or pedestrians or cars moving through the scene.”

The project is a learning-based method for synthesizing views of complex scenes using only in-the-wild photographs. The researchers — Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth — built on Neural Radiance Fields (NeRF), which uses images taken from different perspectives to model the density and color of a scene as a function of 3D coordinates.

The original NeRF program was unable to model images from real-world situations or uncontrolled image sets, as it was originally designed only for controlled capture settings. These researchers decided to tackle that particular weakness and enabled accurate reconstructions from completely unconstrained image collections taken from the internet.

“We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art,” the team writes.

“NeRF-W captures lighting and photometric post-processing in a low-dimensional latent embedding space. Interpolating between two embeddings smoothly captures variation in appearance without affecting 3D geometry,” the team explains. “NeRF-W disentangles lighting from the underlying 3D scene geometry. The latter remains consistent even as the former changes.”
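To make the appearance-embedding idea more concrete, below is a heavily simplified PyTorch sketch of the concept, not the authors’ implementation: a shared coordinate network predicts volume density from position alone, while color is additionally conditioned on a learned per-photo embedding, so lighting can vary between photos without disturbing the shared geometry. The real model also uses viewing directions, positional encoding, and a transient head, all omitted here.

```python
# Toy reduction of the NeRF-W idea: density depends only on 3D position,
# while color is also conditioned on a per-image appearance embedding.
import torch
import torch.nn as nn

class TinyNeRFW(nn.Module):
    def __init__(self, num_images: int, embed_dim: int = 16, hidden: int = 128):
        super().__init__()
        self.appearance = nn.Embedding(num_images, embed_dim)  # one latent per photo
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)   # geometry, shared across all photos
        self.color_head = nn.Sequential(           # appearance, photo-dependent
            nn.Linear(hidden + embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, xyz: torch.Tensor, image_ids: torch.Tensor):
        features = self.trunk(xyz)                          # (N, hidden)
        sigma = torch.relu(self.density_head(features))     # (N, 1) volume density
        embedding = self.appearance(image_ids)              # (N, embed_dim)
        rgb = self.color_head(torch.cat([features, embedding], dim=-1))  # (N, 3)
        return rgb, sigma

# Swapping or interpolating image_ids changes the predicted color (lighting),
# but the density output -- the 3D geometry -- stays the same.
model = TinyNeRFW(num_images=1000)
rgb, sigma = model(torch.rand(8, 3), torch.zeros(8, dtype=torch.long))
```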

The model of the Brandenburg Gate above is extremely high resolution. While available to preview above, it can also be seen in both Full HD and QHD (1440p).

These advancements have produced much cleaner, less noisy 3D models that are far superior to the original Neural Rerendering in the Wild results from last year. Below are a couple of still capture examples, but the benefits of this latest advancement are clearer when seen in motion via the video above.

The video explanation of how this program works is fascinating to anyone working in the advancement of artificial intelligence. What these researchers have done is extremely impressive, and it will be interesting to see what possible applications of this technology come in the future. You can learn more about the project and technology here.

Samsung is Bringing Some Galaxy S21 Features to Older Devices

Samsung is bringing several of the new capabilities that came with the Galaxy S21 Ultra and its One UI OS to its older phones, including the Galaxy S20, Note 20, Z Fold 2, and Z Flip devices. While not all the features in its latest One UI Android skin are coming, many camera features are.

As part of the One UI 3.1 update, several new features are soon going to be available on legacy Samsung devices.

The ability to capture both stills and video simultaneously with the phone’s cameras, which Samsung calls Enhanced Single Take, is one such feature coming to older devices, along with Multi Mic Recording, which lets you capture audio from both your phone and a connected Bluetooth device. These features are aimed at creators using their Samsung devices for more complicated photo and video projects. While that audience is small at the mid to high-end consumer level, having these capabilities in a smartphone makes it easier and more accessible for new or burgeoning artists to try their hands at more complicated setups.

In the Samsung Gallery app, the Object Eraser tool, which can remove visual elements from photos with the help of AI, is also coming to older devices. The feature was touted as working similarly to Adobe’s Content-Aware Fill tool, and Samsung boasted in its Galaxy S21 Ultra reveal that it was capable of removing people from backgrounds with a single tap. In practice, results reportedly vary depending on how prominent the object you are trying to remove is, which is to be expected; there are limits to what this kind of automated fill can do.

The AI removes the selected elements and intelligently fills in the space they leave behind.
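Samsung has not published how Object Eraser works under the hood, but the general technique of filling a masked region from its surroundings can be sketched with OpenCV’s classical inpainting, a rough stand-in for Samsung’s proprietary AI-based fill. The file paths and mask coordinates below are placeholders.

```python
# Rough stand-in for an object-eraser workflow: classical inpainting over a
# user-drawn mask. Samsung's actual tool uses proprietary AI models instead.
import cv2
import numpy as np

image = cv2.imread("photo.jpg")                  # placeholder input photo
mask = np.zeros(image.shape[:2], dtype=np.uint8)
mask[200:400, 300:500] = 255                     # white marks the region to erase

# Fill the masked region from surrounding pixels (Telea fast-marching method).
result = cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("photo_erased.jpg", result)
```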

Finally, Eye Comfort Shield, which limits the amount of blue light your phone outputs, and Private Share, which allows you to add an expiration date to files you send to others, are both also coming to legacy devices as part of this update. Both of these features fall into the “nice to have” category, and the file-sharing feature, in particular, should make using Samsung phones for business activities more secure and easier to monitor.

Samsung will start rolling out the One UI 3.1 update starting today in “select markets.”

(via Engadget)

AI Can Now Turn You Into a Fully Digital, Realistic Talking Clone

Hour One describes itself as a “video transformation company” that wants to replace cameras with code, and its latest creation is the ability for anyone to create a fully digital clone of themselves that can appear to speak “on camera” without a camera or audio inputs at all.

The company has debuted its digital clone technology in partnership with YouTuber Taryn Southern. The Southern who appears in the video above is a fully digital creation, produced as a collaborative experiment between Southern and Hour One. The company uses a proprietary AI-driven process to automate video creation, which enables presenter-led videos at scale without needing to put a person in front of a camera.

Hour One says that experts (whom it does not cite) predict that in the next five to seven years, 90% of content will be synthetic, meaning generated using computers instead of cameras. The company believes that issues arising from the coronavirus pandemic have exacerbated the need for this technology and fast-tracked its development.

“When the pandemic hit, production all over the world shut down. People were looking for alternate ways to make content and I was curious about what could be produced with AI-generated video,” says Southern. “Experimenting with AI video production has been similar to working with AI music. It provokes important conversations around the future of identity and trust, and will undoubtedly change the future of production.”

In order to create the “AI Clone,” Southern had to go into a studio and stand in front of a green screen so she could be captured from multiple angles. She also had to say several sets of words so that the program would be able to replicate her voice. In the video below, she describes the process as just reading a couple of scripts and singing a song. The entire process in front of the camera took just seven minutes.

From there, hundreds of videos can be generated in a matter of minutes just by submitting text to the platform. A creator would not need to record any audio at all.

On the plus side, it doesn’t appear possible to create an AI clone without this studio time. But it also means it would theoretically be possible to obtain the AI version of Southern and feed any text into the program, which the AI would read as though it were her. The ramifications of that are daunting.

Still, Hour One argues that the benefits of its technologies outweigh the possible downsides. The company claims that with this technology, content creators will see a drastic reduction in the time and cost of video production to a matter of minutes. Additionally, a video can be created without a time-intensive routine to look presentable for the camera (AI Taryn jokes that she can now create new YouTube videos “without the real Taryn having to shower or leave her bed.”).

Additionally, any AI clone can speak multiple languages, which allows content to be distributed to more people around the world.

It is important to distinguish this technology from a “deepfake.” Deepfakes take a target face and overlay it on top of existing or newly-recorded footage. What Hour One is doing here is allowing for completely original content to be created as though it were being spoken by the real person. Hour One is calling the result a “photoreal digital human.”

While this process may not lend itself to all types of content (like comedy, for instance, which relies heavily on performance and timing), Hour One argues it could be highly effective for news formats, for which the focus is on timeliness and quality of writing and reporting, and other kinds of presenter-led content.

“In our increasingly virtual work environment, Hour One’s technology is also being applied to e-learning, e-commerce, and digital health – places where a human presenter is highly valuable,” the company says.

Hour One’s photoreal digital human technology is rolling out now, with multiple examples available on its website. While the early iteration of the technology may look slightly short of truly real, it is quite close. Hour One will likely iterate and improve on this design in the months and years ahead.

Hour One is also accepting applications on its website from anyone who wants to be considered as an “Hour One character.”

Canon jumps on the AI bandwagon with its free photo culling app

Canon has just launched a new app that helps you select the best among all those phone snaps. Like some attempts before it, Canon is relying on AI which should help ditch the “bad” photos and keep the “good” ones. But can AI really recognize what’s worth keeping and what should be discarded? Let’s try […]

Samsung Exec Envisions Future Where Photos Are Customized to the Shooter

Have you ever wondered why different manufacturers’ smartphone images have a certain “look” to them? In an interview with Engadget, Joshua Sungdae Cho, Vice President and head of Samsung Mobile’s visual software R&D, explains how the way images are rendered now will change going forward.

According to Engadget, it’s hard to find a review of Samsung’s latest flagship, the Galaxy S21 Ultra, that doesn’t call out the way the company chooses to render images: fine details are heavily sharpened – to a degree that many consider overkill – colors are pumped, and auto smoothing computationally “airbrushes” faces. For hobbyists and professionals in the photography space, these choices are bad individually but near sacrilegious when they happen all at the same time.

So why does Samsung do this? According to the interview, Cho says the company isn’t the one responsible for this kind of processing: the users are.

Samsung apparently doesn’t have an internal team that determines what images should look like; rather, it uses crowdsourced data to determine what the average person believes looks best in an image. It does this by holding focus groups with users from around the world and asking them what they like about images and what could be improved. The goal is to pinpoint which aspects of an image the company can focus on to appeal to the most users most of the time.

The conversations can get pretty granular, though. In trying to suss out what people like about their favorite images, Cho says discussions can veer toward subjects like color tone and saturation, noise levels, sharpness of details, overall brightness and beyond, all so Samsung can tune its HDR models and its smart scene optimizer to deliver what he calls “perfectly trendy” photos.

Trying to please everyone all of the time is an impossible task, but Samsung seems intent on trying to do it.

That said, Cho seems to understand this method isn’t a sustainable practice and envisions a future where the same photo taken by multiple people will look different for each of them, with AI able to know exactly how to best please each individual.

“When there are ten people taking a picture of the same object, I want the camera to provide ten different pictures for each individual based on their preference,” Cho says.

The limiting factor to this future is, unfortunately, neural processing unit (NPU) technology. Chip makers like Qualcomm are only on their third iteration of NPUs, which for Cho is still the “initial stage” of development. But with the right investment into NPU research and development, it’s very likely that your Samsung phone will eventually take photos that look nothing like the ones your friend takes on theirs.

Cho explains that the right neural engine could look at a person’s album to determine what they saved and what they deleted, examine the filters they use, and track trends in how they edit their images.
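Cho did not describe an implementation, but a toy version of that idea might look something like the entirely hypothetical sketch below: compare simple statistics of the photos a user keeps against the ones they delete, and treat the difference as a per-user rendering preference. Real on-device preference learning would be far more involved; the directory names here are placeholders.

```python
# Entirely hypothetical sketch of learning a user's taste from their album:
# compare mean saturation/brightness of kept vs. deleted photos and use the
# difference as a per-user rendering bias. Not Samsung's actual method.
from pathlib import Path
import numpy as np
from PIL import Image

def photo_stats(path: Path) -> tuple[float, float]:
    hsv = np.asarray(Image.open(path).convert("HSV"), dtype=np.float32)
    return hsv[..., 1].mean(), hsv[..., 2].mean()  # mean saturation, mean brightness

def preference_profile(kept_dir: str, deleted_dir: str) -> dict:
    kept = np.array([photo_stats(p) for p in Path(kept_dir).glob("*.jpg")])
    deleted = np.array([photo_stats(p) for p in Path(deleted_dir).glob("*.jpg")])
    return {
        "target_saturation": float(kept[:, 0].mean()),
        "target_brightness": float(kept[:, 1].mean()),
        # Positive values suggest the user keeps punchier/brighter photos.
        "saturation_bias": float(kept[:, 0].mean() - deleted[:, 0].mean()),
        "brightness_bias": float(kept[:, 1].mean() - deleted[:, 1].mean()),
    }

print(preference_profile("album/kept", "album/deleted"))  # placeholder directories
```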

“Those are some of the things that we could look at in order to ensure the system learns about the user,” he says.

For now, it’s unclear how close or far away that future is, and Cho wouldn’t say. What is clear is that chip makers have a way to go in order to keep up with the dreams and ambitions of people like Cho.

Engadget’s full interview is definitely worth your time if you’re interested in the future of computational photography and how a company envisions the future of smartphone image-making. You can read it here.


Image credits: Header photo by Dan Smedley.

First Luminar AI update brings even simpler editing and major bug fixes

Luminar AI was first introduced in September 2020, and shortly after, it got reflections in sky replacement. But now the program has launched its first update. It has made the already simple editing even simpler, and some of the major bugs have been fixed. So, let’s see what’s new in the Luminar AI 1.0.1. Interface […]
